A Bayesian approach to compatibility, improvement, and pooling of quantum states 



M. S. Leifer 

Department of Physics and Astronomy, University College London, 
Gower Street, London WC1E 6BT, United Kingdom

Robert W. Spekkens 

Perimeter Institute for Theoretical Physics, 31 Caroline St. N., Waterloo, Ontario, Canada, N2L 2Y5

(Dated: October 3, 2011) 

In approaches to quantum theory in which the quantum state is regarded as a representation 
of knowledge, information, or belief, two agents can assign different states to the same quantum 
system. This raises two questions: when are such state assignments compatible? and how should 
the state assignments of different agents be reconciled? In this paper, we address these questions 
from the perspective of the recently developed conditional states formalism for quantum theory 
[1]. Specifically, we derive a compatibility criterion proposed by Brun, Finkelstein and Mermin
from the requirement that, upon acquiring data, agents should update their states using a quantum 
generalization of Bayesian conditioning. We provide two alternative arguments for this criterion, 
based on the objective and subjective Bayesian interpretations of probability theory. We then apply 
the same methodology to the problem of quantum state improvement, i.e. how to update your state 
when you learn someone else's state assignment, and to quantum state pooling, i.e. how to combine 
the state assignments of several agents into a single assignment that accurately represents the views 
of the group. In particular, we derive a pooling rule previously proposed by Spekkens and Wiseman 
under much weaker assumptions than those made in the original derivation. All of our results apply 
to a much broader class of experimental scenarios than have been considered previously in this 
context. 

I. INTRODUCTION

In Bayesian probability theory, probabilities represent 
an agent's information, knowledge or beliefs; and hence 
it is possible for two agents to assign different probability 
distributions to one and the same quantity. Recently, due 
in part to the emergence of quantum information theory, 
there has been a resurgence of interest in approaches to 
quantum theory that view the quantum state in a similar
way [2-13], and in such approaches it is possible for
two agents [56] to assign different quantum states to one
and the same quantum system (henceforth, to avoid rep- 
etition, the term "state" will be used to refer to either 
a classical probability distribution or a quantum state). 
One way this can arise is when the agents have access 
to differing data about the system. For example, in the 
BB84 quantum key distribution protocol [14], Alice, having
prepared the system herself, would assign one of four
pure states to the system, whereas the best that Bob can 
do before making his measurement is to assign a maxi- 
mally mixed state to the system. This naturally leads to 
the question of when two state assignments are compati- 
ble with one another, i.e. when can they represent validly 
differing views on one and the same system? 

The meaning of "validly differing view" depends on the 
interpretation of quantum theory and, in particular, on 
the status of the quantum state within it. If the quan- 
tum state is thought of as being analogous to a Bayesian 
probability distribution, then the meaning of "validly dif- 
fering view" also depends on precisely which approach to 
Bayesian probability one is trying to apply to the quantum
case. In the Jaynes-Cox approach [15, 16], sometimes
called objective Bayesianism, states are taken to
represent objective information or knowledge and, given 
a particular collection of known data, there is assumed 
to be a unique state that a rational agent ought to as- 
sign, often derived from a rule such as the Jaynes max- 
imum entropy principle. In contrast, in the de Finetti- 
Ramsey-Savage approach [17-21], often called subjective
Bayesianism, states are taken to represent an agent's sub- 
jective degrees of belief and agents may validly assign dif- 
ferent states to the same system even if they have access 
to identical data about the system. This is due to differ- 
ing prior state assignments, the roots of which are taken 
to be unanalyzable by the subjective Bayesian. 

In its modern form, the problem of quantum state compatibility
was first tackled by Brun, Finkelstein and Mermin
(BFM) [22-25], although this work was motivated
by earlier concerns of Peierls [26, 27]. BFM provide a
compatibility criterion for quantum states on finite di- 
mensional Hilbert spaces. Mathematically, the criterion 
is that two density operators are compatible if the intersection
of their supports is nontrivial. In particular, the
BFM criterion implies that two distinct pure states are
never compatible, so that if any agent assigns a pure state 
to the system then any other agent who wishes to assign 
a compatible pure state must assign the same one. In 
the special case of commuting state assignments, it also 
implies the classical criterion for compatibility of proba- 
bility distributions on finite sample spaces, which is that 
there must be at least one element of the sample space 
that is in the support of both distributions. 
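
As an illustration, the criterion can be checked numerically. The following is a minimal sketch, not taken from the literature cited here: the function names and the tolerance are our own assumptions. It uses the standard fact that, for positive operators, the support of a sum is the sum of the supports, so the dimension of the intersection is rank(rho1) + rank(rho2) - rank(rho1 + rho2).

```python
import numpy as np

def support_overlap_dim(rho1, rho2, tol=1e-10):
    """Dimension of the intersection of the supports of two positive operators.

    For positive operators, supp(rho1 + rho2) = supp(rho1) + supp(rho2), hence
    dim(supp rho1 ∩ supp rho2) = rank(rho1) + rank(rho2) - rank(rho1 + rho2).
    """
    r1 = np.linalg.matrix_rank(rho1, tol=tol)
    r2 = np.linalg.matrix_rank(rho2, tol=tol)
    r12 = np.linalg.matrix_rank(rho1 + rho2, tol=tol)
    return int(r1 + r2 - r12)

def bfm_compatible(rho1, rho2, tol=1e-10):
    """BFM criterion: compatible iff the supports intersect nontrivially."""
    return support_overlap_dim(rho1, rho2, tol) > 0

# Illustrative states (hypothetical choices).
ket0 = np.array([1.0, 0.0])
ket_plus = np.array([1.0, 1.0]) / np.sqrt(2)
pure0 = np.outer(ket0, ket0)
pure_plus = np.outer(ket_plus, ket_plus)
maximally_mixed = np.eye(2) / 2

distinct_pure = bfm_compatible(pure0, pure_plus)        # two distinct pure states
pure_vs_mixed = bfm_compatible(pure0, maximally_mixed)  # Alice vs Bob in BB84

# Commuting (classical) case: diagonal operators encode distributions.
p = np.diag([0.5, 0.5, 0.0])
q_disjoint = np.diag([0.0, 0.0, 1.0])
q_overlap = np.diag([0.0, 0.3, 0.7])
classical_disjoint = bfm_compatible(p, q_disjoint)
classical_overlap = bfm_compatible(p, q_overlap)
```

In the commuting case the check reduces to asking whether the two distributions share at least one element of the sample space in their supports.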

To date there have been two types of argument given 
for requiring the BFM compatibility criterion: one due to 
BFM themselves [22-25] (an argument that takes a similar
point of view was later developed by Jacobs [28]) and one
due to Caves, Fuchs and Schack (CFS) [29]. Although
not explicitly given in Bayesian terms, the BFM argu- 
ment has an objective Bayesian flavor in that it assumes 
that there is a unique quantum state that all agents 
would assign to the system if they had access to all the 
available data. On the other hand, the CFS argument is 
an attempt to give an explicitly subjective Bayesian ar- 
gument for the BFM compatibility criterion. Both argu- 
ments start from lists of intuitively plausible criteria that 
state assignments should obey, but, in our view, a more 
rigorous approach is needed in order to correctly gener- 
alize the meaning that compatibility has in the classical 
case. 

Classically, there are two arguments for compatibility 
depending on whether one adopts the objective or the 
subjective approach. In both cases, compatibility is de- 
fined in terms of the rules that Bayesian probability the- 
ory lays down for making probabilistic inferences, and, 
in particular the requirement that, upon learning new 
data, states should be updated by Bayesian condition- 
ing. The reason for demanding an argument based on a 
well-defined methodology for inference is that there are 
situations in which even a Bayesian would want to up- 
date their state assignment by means other than Bayesian 
conditioning. For example, if you discover some informa- 
tion that is better represented as a constraint than as the 
acquisition of new data, such as finding out the mean energy
of the molecules in a gas, then minimization of relative
entropy, rather than Bayesian conditioning, would
commonly be used to update probabilities [30, 31]. Arguments
have also been made for applying generalizations of
Bayesian conditioning, e.g. Jeffrey conditioning [32, 33],
on the acquisition of new data in certain circumstances. 
It is not clear whether the intuitions used by BFM and 
CFS are applicable to all such circumstances and indeed 
our intuitions about probabilities and quantum states are 
not all that reliable in general. It is therefore important 
to be clear about the type of inference procedures that 
are being allowed for in any argument for a compatibility 
condition. 

What is missing from the existing arguments for BFM 
compatibility is a specification of precisely what sorts 
of probabilistic inferences are valid — in short, a pre- 
cise quantum analog of Bayesian conditioning. We have 
recently proposed such an analog within the formalism 
of conditional quantum states [1]. This formalism has
the advantage of being more causally neutral than the 
standard quantum formalism, by which we mean that 
Bayesian conditioning is applied in the same way regard- 
less of how the data is causally related to the system 
of interest, e.g. the data could be the outcome of a di- 
rect measurement of the system, a variable involved in 
the preparation of the system, the outcome of a mea- 
surement of a remote system that is correlated with the 
system of interest, etc. This causal neutrality allows us to 
develop arguments that are applicable in a broader range 
of experimental scenarios — or more accurately, causal 
scenarios — than those obtained within the conventional 
formalism. 

In this article, we derive BFM compatibility from the 
principled application of the idea that, upon learning new 
data, agents should update their states according to our 
quantum analogue of Bayesian conditioning. This leaves 
no room for other principles of a more ad hoc nature. 
Both objective and subjective Bayesian arguments are 
given by first reviewing the corresponding classical com- 
patibility arguments and then drawing out the parallels 
to the quantum case using conditional states. The BFM- 
Jacobs and CFS arguments are then criticized in the light 
of our results. 

Having dealt with the question of how state assign- 
ments can differ, we then turn to the question of how 
to combine the state assignments of different agents. In 
Bayesian theory, the purpose of states is to provide a 
guide to rational decision making via the principle of 
maximizing expected utility. In its usual interpretation, 
this is a rule for individual decision making that does not 
take into account the views of other agents. This raises 
two conceptually distinct problems. 

Firstly, decision making should be performed on the 
basis of all available relevant evidence. The fact that an- 
other agent assigns a particular state could be relevant 
evidence, and may cause you to change your state as- 
signment, even in the case where both state assignments 
are the same. For example, if both you and I assign the 
same high probability to some event, then telling you my 
state assignment may cause you to assign an even higher 
probability if you believe that my reasons for assigning 
a high probability are valid and that they are independent
of yours. Following Herbut [34], we call updating
your state assignment in light of another agent's state 
assignment state improvement. 

Secondly, if two agents do have different state assign- 
ments, then they may have different preferences over the 
available choices in decision making scenarios. In prac- 
tice, decisions often have to be made as a group, in which 
case a preference conflict prevents all the agents in the 
group from maximizing their individual expected utili- 
ties simultaneously. This motivates the need for meth- 
ods of combining state assignments into a single assign- 
ment that accurately represents the beliefs, information, 
or knowledge of the group as a whole. This problem is 
called state pooling. 



In the classical case, both improvement and pooling 
have been studied extensively (see [35] and [36] for reviews).
From this it is clear that there is no hope of coming
up with a universal rule, applicable to all cases, that
is just a simple functional of the different state assign- 
ments. Instead, we offer a general methodology for com- 
bining states, in both the classical and quantum cases, 
again based on the application of Bayesian conditioning. 



Learning another agent's state assignment can be 
thought of as acquiring new data. Therefore, given 
our Bayesian methodology, the state improvement prob- 
lem is solved by simply conditioning on this data. For 
state pooling, we adopt the supra-Bayesian approach
[37], which requires the agents to put themselves in the
shoes of a neutral decision maker. Although their abil- 
ity to do this is not guaranteed, doing so reduces the 
pooling problem to an instance of state improvement, 
i.e. the neutral decision maker's state is conditioned on 
all the other agents' state assignments and the result is 
used as the pooled state. As with compatibility, our ap- 
proach to these problems is to draw out the parallels to 
the classical case using conditional states and to derive 
our results by a principled application of Bayesian condi- 
tioning. This is an improvement over earlier approaches 
[HI Hi| [H, [33, [H, |39| , which use more ad hoc principles. 
However, some of the results of these earlier approaches 
are recovered within the present approach. In particular,
a pooling rule previously proposed by Spekkens and
Wiseman [10] can be derived from our method in the special
case where the minimal sufficient statistics for the
data collected by different agents satisfy a condition that 
is slightly weaker than conditional independence. This 
is an improvement on the original derivation, which only 
holds for a more restricted class of scenarios. 



The results in this paper can be viewed as a demonstration
of the conceptual power of the conditional states
formalism developed in [1]. However, two concepts that
were not discussed in [1] are required to develop our approach
to the state improvement and pooling problems.
These are quantum conditional independence and suffi- 
cient statistics. Conditional independence has previously 
been studied in [40], from which we borrow the required
results. Several definitions of quantum sufficient statistics
have been given in the literature [41-43], but they
concern sufficient statistics for a quantum system with 
respect to a classical parameter, or sufficient statistics 
for measurement data with respect to a preparation vari- 
able. By contrast, here we need sufficient statistics for 
classical variables with respect to quantum systems. Our 
treatment of this is novel to the best of our knowledge. 

II. REVIEW OF THE CONDITIONAL STATES
FORMALISM 

A. Basic concepts 

The conditional states formalism, developed in [1],
treats quantum theory as a generalization of the classical 
theory of Bayesian inference. In the quantum generaliza- 
tion, classical variables become quantum systems, and 
normalized probability distributions over those variables 
become operators on the Hilbert spaces of the systems 
that have unit trace but are not always positive. The 
generalization is summarized in Table I, the elements of
which we now review. The treatment here is necessarily
brief. A more detailed development of the formalism and
its relation to the conventional quantum formalism can
be found in [1].

Note that we adopt the convention that classical vari- 
ables are denoted by letters towards the end of the alpha- 
bet, such as R, S, T, X, Y and Z, while quantum systems 
are denoted by letters near the beginning of the alphabet, 
such as A, B and C. 

In the classical theory of Bayesian inference, a joint 
probability distribution P(R, S) describes an agent's 
knowledge, information or degrees of belief about a pair 
of random variables R and S. There is no constraint on 
the interpretation of what the two variables can repre- 
sent. They may refer to the properties of two distinct 
physical systems at a single time, or to the properties of 
a single system at two distinct times, or indeed to any 
pair of physical degrees of freedom located anywhere in 
spacetime. They may even have a completely abstract 
interpretation that is independent of physics, e.g. R 
could represent acceptance or rejection of the axioms of 
Zermelo-Fraenkel set theory and S could be the truth 
value of the Riemann hypothesis. However, given that
we are interested in quantum theory, such abstract inter- 
pretations are of less interest to us than physical ones. 
The main point is that the same mathematical object, a 
joint probability distribution P(R,S), is used regardless 
of the interpretation of the variables in terms of physical 
degrees of freedom. 

The theory of quantum Bayesian inference aims to 
achieve a similar level of independence from physical in- 
terpretation. In particular, we want to describe infer- 
ences about two systems at a fixed time via the same 
rules that are used to describe a single system at two 
times. As such, the usual talk of "systems" in quantum 
theory is inappropriate, as a system is usually thought 
of as something that persists in time. Instead, the basic 
element of the conditional states formalism is a region. 
An elementary region describes what would normally be 
called a system at a fixed point in time and a region is 
a collection of elementary regions. For example, whilst 
the input and output of a quantum channel are usually 
thought of as the same system in the conventional for- 
malism, they correspond to two disjoint regions in our 
terminology. This gives a greater symmetry to the case
of two systems at a single time, which also correspond to
two disjoint regions. 

A region A is assigned a Hilbert space $\mathcal{H}_A$ and a composite
region AB consisting of two disjoint regions, A
and B, is assigned the Hilbert space $\mathcal{H}_{AB} = \mathcal{H}_A \otimes \mathcal{H}_B$.
The knowledge, information, or beliefs of an agent about
AB are described by a linear operator on $\mathcal{H}_{AB}$ (this operator
has other mathematical properties which will be
discussed further on). This operator is called the joint
state and, for the moment, we denote it by $\tau_{AB}$. Ideally,
one would like this framework to handle any set of re- 
gions, regardless of where they are situated in spacetime, 
but unfortunately the formalism developed in [1] is not
quite up to the task. For instance, it is currently unclear 
how to represent degrees of belief about three regions 
that describe a system at three distinct times. 

In a classical theory of Bayesian inference, one also 
has the freedom to conditionalize upon any set of vari- 
ables, regardless of the spatio-temporal relations that 
hold among them, or indeed of the spatio-temporal re- 
lations between the conditioning variables and the con- 
ditioned variables. Therefore, this is an ideal to which a 
quantum theory of Bayesian inference should also strive. 
Again, the formalism of [1] does not quite achieve this
ideal. For instance, this framework cannot currently deal 
with pre- and post-selection, for which the conditioning 
regions straddle the conditioned system in time. 

Whilst these sorts of consideration limit the scope of 
our results, we are still able to treat a wide variety of 
causal scenarios including all those that have been previously
discussed in the literature on compatibility, improvement,
and pooling. We begin by providing a synopsis
of the formalism as it has been developed thus far [57].

Table I summarizes the basic concepts and formulas of
this framework and defines the terminology that we use 
for them. 

For an elementary region A, the quantum analogue of a
normalized probability distribution is a trace-one operator
$\tau_A$ on $\mathcal{H}_A$. For a region AB, composed of two disjoint
elementary regions, the analogue of a joint distribution
$P(R,S)$ is an operator $\tau_{AB}$ on $\mathcal{H}_{AB}$. The marginalization
operation $P(S) = \sum_R P(R,S)$, which corresponds
to ignoring R, is replaced by the partial trace operation,
$\tau_B = \mathrm{Tr}_A(\tau_{AB})$, which corresponds to ignoring A. The
role of the marginal distribution $P(S)$ is played by the
marginal state $\tau_B$.

If A is an elementary region, then $\tau_A$ is also positive,
and simply corresponds to a conventional density operator
on A. To highlight this fact, we denote it by $\rho_A$ in
this case. The positivity of marginal states on elementary
regions implies that the joint state $\tau_{AB}$ of a pair of
elementary regions must have positive partial traces (but
it need not itself be a positive operator).

Another key concept in classical probability is a conditional
probability distribution $P(S|R)$. $P(S|R)$ represents
an agent's degrees of belief about S for each possible
value of R. It satisfies $\sum_s P(S = s|R = r) = 1$
for all r and is related to the joint probability by

                               Classical                        Quantum

State                          $P(R)$                           $\tau_A$
Joint state                    $P(R,S)$                         $\tau_{AB}$
Marginalization                $P(S) = \sum_R P(R,S)$           $\tau_B = \mathrm{Tr}_A(\tau_{AB})$
Conditional state              $P(S|R)$                         $\tau_{B|A}$
                               $\sum_S P(S|R) = 1$              $\mathrm{Tr}_B(\tau_{B|A}) = I_A$
Relation between joint and     $P(R,S) = P(S|R)P(R)$            $\tau_{AB} = \tau_{B|A} \star \tau_A$
conditional states             $P(S|R) = P(R,S)/P(R)$           $\tau_{B|A} = \tau_{AB} \star \tau_A^{-1}$
Bayes' theorem                 $P(R|S) = P(S|R)P(R)/P(S)$       $\tau_{A|B} = \tau_{B|A} \star (\tau_A \tau_B^{-1})$
Belief propagation             $P(S) = \sum_R P(S|R)P(R)$       $\tau_B = \mathrm{Tr}_A(\tau_{B|A}\tau_A)$

TABLE I: Analogies between the classical theory of Bayesian inference and the conditional states formalism for quantum theory.



$P(S|R) = P(R,S)/P(R)$. This implies Bayes' theorem,
$P(R|S) = P(S|R)P(R)/P(S)$, which allows conditionals
to be inverted. Conditional probabilities are critical
to probabilistic inference. In particular, if you assign
the conditional distribution $P(S|R)$ and your state for
R is $P(R)$, then your state for S can be computed from
$P(S) = \sum_R P(S|R)P(R)$. This map from $P(R)$ to $P(S)$
is called belief propagation.

The quantum analogue of a conditional probability is
a conditional state for region B given region A. This
is a linear operator on $\mathcal{H}_{AB}$, denoted $\tau_{B|A}$, that satisfies
$\mathrm{Tr}_B(\tau_{B|A}) = I_A$. It is related to the joint state by
$\tau_{B|A} = \tau_{AB} \star \tau_A^{-1}$, where the $\star$-product is defined by

$$M \star N = N^{1/2} M N^{1/2}, \qquad (1)$$

and we have adopted the convention of dropping identity
operators and tensor products, so that $\tau_{AB} \star \tau_A^{-1}$ is shorthand
for $\tau_{AB} \star (\tau_A^{-1} \otimes I_B) = (\tau_A^{-1/2} \otimes I_B)\, \tau_{AB}\, (\tau_A^{-1/2} \otimes I_B)$.
The quantum analogue of Bayes' theorem, relating $\tau_{B|A}$
and $\tau_{A|B}$, is $\tau_{A|B} = \tau_{B|A} \star (\tau_A \tau_B^{-1})$. Conditional states
are the key to inference in this framework. In particular,
if you assign the conditional state $\tau_{B|A}$ and your
state for A is $\tau_A$, then your state for B can be computed
from $\tau_B = \mathrm{Tr}_A(\tau_{B|A}\tau_A)$, where we have used the cyclic
property of the trace. This map from $\tau_A$ to $\tau_B$ is called
quantum belief propagation.
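
The relations above can be verified numerically. The following is a minimal sketch under stated assumptions: the helper names (`psd_power`, `star`, `partial_trace`) are our own, the joint state is a randomly generated two-qubit density operator chosen only for illustration, and inverses are taken on the support of the operator.

```python
import numpy as np

def psd_power(m, power, tol=1e-12):
    """Raise a positive semidefinite matrix to a real power on its support."""
    vals, vecs = np.linalg.eigh(m)
    scaled = np.zeros_like(vals)
    keep = vals > tol
    scaled[keep] = vals[keep] ** power
    return (vecs * scaled) @ vecs.conj().T

def star(m, n):
    """Star product of Eq. (1): M * N = N^{1/2} M N^{1/2}, for positive N."""
    root = psd_power(n, 0.5)
    return root @ m @ root

def partial_trace(op, d_a, d_b, keep):
    """Trace out one factor of an operator on H_A (x) H_B; keep is 'A' or 'B'."""
    t = op.reshape(d_a, d_b, d_a, d_b)
    return np.einsum('ajbj->ab', t) if keep == 'A' else np.einsum('iaib->ab', t)

# Hypothetical example: a random full-rank joint state of two qubits.
rng = np.random.default_rng(0)
g = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
tau_ab = g @ g.conj().T
tau_ab /= np.trace(tau_ab).real
eye = np.eye(2)

tau_a = partial_trace(tau_ab, 2, 2, keep='A')
tau_b = partial_trace(tau_ab, 2, 2, keep='B')

# tau_{B|A} = tau_AB * tau_A^{-1}, with the dropped identity on B restored.
tau_b_given_a = star(tau_ab, np.kron(psd_power(tau_a, -1.0), eye))

# Normalization: Tr_B(tau_{B|A}) = I_A.
normalized = np.allclose(partial_trace(tau_b_given_a, 2, 2, keep='A'), eye)

# Belief propagation: tau_B = Tr_A(tau_{B|A} tau_A).
propagated = partial_trace(tau_b_given_a @ np.kron(tau_a, eye), 2, 2, keep='B')

# Bayes' theorem: tau_{A|B} = tau_{B|A} * (tau_A tau_B^{-1}).
tau_a_given_b = star(tau_b_given_a, np.kron(tau_a, psd_power(tau_b, -1.0)))
tau_a_given_b_direct = star(tau_ab, np.kron(eye, psd_power(tau_b, -1.0)))
```

Here `propagated` should agree with `tau_b`, and the two expressions for the conditional state of A given B should coincide, up to floating-point error.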

B. The relevance of causal relations 

The rules of classical Bayesian inference are indepen- 
dent of the causal relationships between the variables un- 
der consideration. For instance, the formula for belief 
propagation from R to S does not depend on whether 
R and S represent properties of distinct systems or of 
the same system at two different times. Nonetheless,
causal relations between variables can affect the set of
probability distributions that are regarded as plausible 
models. For example, if T is a common cause of R and 
S, then R and S should be conditionally independent 
given T, i.e. any viable probability model should satisfy 
$P(R,S|T) = P(R|T)P(S|T)$.

In the quantum case, the situation is similar. The rules 
of inference, such as the formula for belief propagation, 
do not depend on the causal relations between the regions 
under consideration, but causal relations do affect the set 
of operators that can describe joint states. Indeed, the 
dependence is stronger in the quantum case because the 
kind of operator used depends on the causal relation even 
for a pair of regions. 

Suppose that A and B represent elementary regions. 
A and B are causally related if there is a direct causal 
influence from A to B (for instance, if A and B are the 
input and the output of a quantum channel), or if there 
is an indirect causal influence through other regions (for 
instance, there is a sequence of channels with A as the 
input to the first and B as the output of the last). A 
and B are acausally related if there is no such direct or 
indirect causal connection between them, for instance, if 
they represent two distinct systems at a fixed time. 

If A and B are acausally related, then their joint state
$\tau_{AB}$ is a positive operator. It simply corresponds to a
standard density operator for independent systems. The
conditionals $\tau_{A|B}$ and $\tau_{B|A}$ are then also positive operators.
Given that $\rho$ is the standard notation for density
operators, a joint state of two acausally related regions
is denoted $\rho_{AB}$. Similarly, the conditional states are denoted
$\rho_{A|B}$ and $\rho_{B|A}$. This notation is meant to be a
reminder of the mathematical properties of these operators.
We refer to them as acausal (joint and conditional)
states.

If A and B are causally related, then $\tau_{AB}$ does not
have to be a positive operator, but $\tau_{AB}^{T_A}$ (or equivalently
$\tau_{AB}^{T_B}$) is always positive, where $T_A$ and $T_B$ denote partial
transpose operations on A and B [58]. Similarly, $\tau_{A|B}$ and
$\tau_{B|A}$ are not necessarily positive, but they must have
positive partial transpose. In this case, the operators
$\tau_{AB}$, $\tau_{A|B}$ and $\tau_{B|A}$ are denoted $\varrho_{AB}$, $\varrho_{A|B}$ and $\varrho_{B|A}$
respectively and we refer to them as causal (joint and conditional)
states. In particular, dynamical evolution taking
$\rho_A$ to $\rho_B$ can be represented as quantum belief propagation
using a causal conditional state $\varrho_{B|A}$, i.e.
$\rho_B = \mathrm{Tr}_A(\varrho_{B|A}\rho_A)$. If, in the conventional formalism, the
dynamics would be described by a Completely Positive
Trace-preserving (CPT) map $\mathcal{E}_{B|A}: \mathcal{L}(\mathcal{H}_A) \to \mathcal{L}(\mathcal{H}_B)$,
then the corresponding conditional state $\varrho_{B|A}$ is the operator
on $\mathcal{H}_A \otimes \mathcal{H}_B$ that is Jamiołkowski-isomorphic [44]
to $\mathcal{E}_{B|A}$, that is, $\varrho_{B|A} = \sum_{j,k} |j\rangle\langle k|_A \otimes \mathcal{E}_{B|A'}(|k\rangle\langle j|_{A'})$,
where $\mathcal{H}_{A'}$ is isomorphic to $\mathcal{H}_A$ and $\{|j\rangle\}$ is any orthonormal
basis for $\mathcal{H}_A$.
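
To make the isomorphism concrete, the following sketch builds the causal conditional state from a Kraus representation of a channel and checks that belief propagation reproduces the channel output. The function names, the amplitude-damping channel, its parameter and the input state are illustrative assumptions of ours, not examples from [1].

```python
import numpy as np

def causal_conditional_state(kraus, d_in):
    """varrho_{B|A} = sum_{j,k} |j><k|_A (x) E(|k><j|), E given by Kraus operators."""
    d_out = kraus[0].shape[0]
    varrho = np.zeros((d_in * d_out, d_in * d_out), dtype=complex)
    for j in range(d_in):
        for k in range(d_in):
            jk = np.zeros((d_in, d_in))
            jk[j, k] = 1.0                                     # |j><k| on A
            image = sum(K @ jk.T @ K.conj().T for K in kraus)  # E(|k><j|)
            varrho += np.kron(jk, image)
    return varrho

def partial_transpose_a(op, d_a, d_b):
    """Partial transpose on the first tensor factor."""
    t = op.reshape(d_a, d_b, d_a, d_b)
    return t.transpose(2, 1, 0, 3).reshape(d_a * d_b, d_a * d_b)

def propagate(varrho, rho_a, d_b):
    """Quantum belief propagation: rho_B = Tr_A(varrho_{B|A} rho_A)."""
    d_a = rho_a.shape[0]
    prod = varrho @ np.kron(rho_a, np.eye(d_b))
    return np.einsum('iaib->ab', prod.reshape(d_a, d_b, d_a, d_b))

# Hypothetical channel: amplitude damping with damping parameter gamma.
gamma = 0.3
kraus = [np.array([[1.0, 0.0], [0.0, np.sqrt(1 - gamma)]]),
         np.array([[0.0, np.sqrt(gamma)], [0.0, 0.0]])]
rho_a = np.array([[0.6, 0.2], [0.2, 0.4]])

varrho = causal_conditional_state(kraus, 2)
via_conditional_state = propagate(varrho, rho_a, 2)
via_kraus = sum(K @ rho_a @ K.conj().T for K in kraus)

# Identity channel: the causal state is the swap operator, which is not
# positive, while its partial transpose is.
varrho_id = causal_conditional_state([np.eye(2)], 2)
min_eig = np.linalg.eigvalsh(varrho_id).min()
min_eig_pt = np.linalg.eigvalsh(partial_transpose_a(varrho_id, 2, 2)).min()
```

The identity channel shows why positivity is required of the partial transpose rather than of the causal state itself.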

C. Modeling classical variables 

Joint, marginal and conditional classical probability
distributions are special cases of joint, marginal and conditional
quantum states. To see this, note that a random
variable R, with $d_R$ possible values, can be associated
with a $d_R$-dimensional Hilbert space with a preferred
basis $\{|r_1\rangle_R, |r_2\rangle_R, \ldots, |r_{d_R}\rangle_R\}$ labeled by the possible
values of R. Then, a probability distribution $P(R)$
can be encoded in a density operator that is diagonal
in this basis via $\tau_R = \sum_r P(R = r)\, |r\rangle\langle r|_R$. Similarly,
for two random variables, R and S, we can construct
Hilbert spaces and preferred bases for each and
encode a joint distribution $P(R,S)$ in a joint state via
$\tau_{RS} = \sum_{r,s} P(R = r, S = s)\, |r\rangle\langle r|_R \otimes |s\rangle\langle s|_S$, and a conditional
distribution $P(S|R)$ in a conditional state via
$\tau_{S|R} = \sum_{r,s} P(S = s|R = r)\, |r\rangle\langle r|_R \otimes |s\rangle\langle s|_S$.

Because all operators on a given classical region commute,
the $\star$-product reduces to the regular operator
product for classical states, so that the formulas for
quantum Bayesian inference reduce to their classical
counterparts. For instance, the quantum Bayes' theorem
becomes $\tau_{R|S} = \tau_{S|R}\tau_R\tau_S^{-1}$, which is equivalent to
$P(R|S) = P(S|R)P(R)/P(S)$.
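
The reduction to the classical case can be checked directly. In the sketch below the joint distribution is an arbitrary illustrative choice and the variable names are ours; the operators are diagonal in the preferred basis, ordered so that the index of $|r\rangle|s\rangle$ is $2r + s$.

```python
import numpy as np

# Illustrative joint distribution P(R, S): rows index r, columns index s.
P_rs = np.array([[0.10, 0.30],
                 [0.40, 0.20]])
P_r = P_rs.sum(axis=1)
P_s = P_rs.sum(axis=0)
P_s_given_r = P_rs / P_r[:, None]

# Encode the distributions as operators diagonal in the preferred basis.
tau_r = np.diag(P_r)
tau_s = np.diag(P_s)
tau_s_given_r = np.diag(P_s_given_r.reshape(-1))
eye = np.eye(2)

# Operator-product form of Bayes' theorem, identities restored:
# tau_{R|S} = tau_{S|R} (tau_R (x) I) (I (x) tau_S^{-1}).
weight = np.kron(tau_r, eye) @ np.kron(eye, np.linalg.inv(tau_s))
tau_r_given_s = tau_s_given_r @ weight

# Star-product form: N^{1/2} M N^{1/2} with N = tau_R (x) tau_S^{-1}.
root = np.sqrt(weight)          # elementwise square root of a diagonal matrix
tau_r_given_s_star = root @ tau_s_given_r @ root

# Read off P(R = r | S = s) from the diagonal and compare with Bayes' rule.
P_r_given_s = np.diag(tau_r_given_s).reshape(2, 2)
P_r_given_s_classical = P_rs / P_s[None, :]
```

Because every operator involved is diagonal, the star product and the ordinary product give the same result, and the diagonal of the resulting operator carries the classical posterior.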

Note that if we adopt the convention that partial transposes
on classical regions are always defined with respect
to the preferred basis, then classical joint and conditional
states are invariant under this operation. Therefore, classical
causal states have the same mathematical properties
as classical acausal states [59]. Since the notational
distinction between $\rho$ and $\varrho$ is supposed to act as a reminder
of the mathematical difference between causal
and acausal states for pairs of quantum regions, there
is no need to make the distinction for classical states.
We therefore adopt the convention of denoting classical
states over an arbitrary set of regions by $\rho$, regardless of
how the regions are causally related.

To complete our discussion of the basic objects in the
conditional states formalism, we need to describe how
correlations between classical and quantum regions can
be represented. The classical variable X is represented
by a Hilbert space $\mathcal{H}_X$ with a preferred basis, as described
above, and the quantum region A is associated
with a Hilbert space $\mathcal{H}_A$ with no preferred structure.
The hybrid region XA is assigned the Hilbert space
$\mathcal{H}_{XA} = \mathcal{H}_X \otimes \mathcal{H}_A$, but in representing correlated states
on this space, we must ensure that the classical part
remains classical. In particular, this means that there
can be no entanglement between X and A, and that
the reduced state on X must be diagonal in the preferred
basis. This motivates defining a hybrid quantum-classical
operator on $\mathcal{H}_{XA}$ to be an operator of the form
$M_{XA} = \sum_x |x\rangle\langle x|_X \otimes M_{X=x,A}$, where each $M_{X=x,A}$ is
an operator on $\mathcal{H}_A$. The operators $M_{X=x,A}$ are called
the components of $M_{XA}$.

It follows that a hybrid joint state has the form
$\tau_{XA} = \sum_x |x\rangle\langle x|_X \otimes \tau_{X=x,A}$, where each component $\tau_{X=x,A}$ is
an operator on $\mathcal{H}_A$. Recall that if X and A are acausally
related, then $\tau_{XA}$ must be positive, while if X and A are
causally related, then $\tau_{XA}^{T_A}$ must be positive. However,
given the form of a hybrid state, $\tau_{XA}$ is positive if and
only if $\tau_{XA}^{T_A}$ is positive, so the two conditions are equivalent.
Consequently, causal and acausal states on hybrid
regions correspond to the same set of operators. Therefore,
as for classical states, $\rho$ is used to denote all hybrid
states, regardless of their causal interpretation.

By calculating the marginal states $\rho_X$ and $\rho_A$ from the
hybrid state $\rho_{XA}$, we can define conditional states as
$\rho_{X|A} = \rho_{XA} \star \rho_A^{-1}$ and $\rho_{A|X} = \rho_{XA} \star \rho_X^{-1} = \rho_{XA}\rho_X^{-1}$.
In the latter case, the $\star$-product reduces to the regular
operator product because X is classical. There are
two sorts of conditional states for hybrid systems, corresponding
to whether the quantum or the classical region
is on the right of the conditional. If the conditioning
system is quantum, then the conditional state has
the form $\rho_{X|A} = \sum_x |x\rangle\langle x|_X \otimes \rho_{X=x|A}$, where each $\rho_{X=x|A}$
is positive and $\sum_x \rho_{X=x|A} = I_A$. It follows that the
set of operators $\{\rho_{X=x|A}\}$ is a Positive Operator Valued
Measure (POVM) and therefore such conditional states
can be used to represent measurements, a fact that we
shall make use of in §III A. If the conditioning system
is classical, then the conditional state has the form
$\rho_{A|X} = \sum_x \rho_{A|X=x} \otimes |x\rangle\langle x|_X$, where each $\rho_{A|X=x}$ is positive
and $\mathrm{Tr}_A(\rho_{A|X=x}) = 1$ for all x. The operators $\{\rho_{A|X=x}\}$
therefore constitute a set of normalized states on A, and
can therefore be used to represent state preparations, a
fact that will also be used in §III A.
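
Both kinds of hybrid conditional state can be computed component by component. The sketch below is only illustrative: the ensemble (probabilities and states) is an arbitrary choice of ours, and `psd_power` is a helper we introduce for the inverse square root.

```python
import numpy as np

def psd_power(m, power, tol=1e-12):
    """Raise a positive semidefinite matrix to a real power on its support."""
    vals, vecs = np.linalg.eigh(m)
    scaled = np.zeros_like(vals)
    keep = vals > tol
    scaled[keep] = vals[keep] ** power
    return (vecs * scaled) @ vecs.conj().T

# Hypothetical ensemble: X takes three values with probabilities p_x.
ket0 = np.array([1.0, 0.0])
ket_plus = np.array([1.0, 1.0]) / np.sqrt(2)
p_x = np.array([0.5, 0.3, 0.2])
states = [np.outer(ket0, ket0), np.outer(ket_plus, ket_plus), np.eye(2) / 2]

# Components rho_{X=x,A} of the hybrid joint state and the marginal on A.
components = [p * s for p, s in zip(p_x, states)]
rho_a = sum(components)

# Quantum conditioning region: rho_{X=x|A} = rho_A^{-1/2} rho_{X=x,A} rho_A^{-1/2}.
inv_root = psd_power(rho_a, -0.5)
povm = [inv_root @ c @ inv_root for c in components]

# Classical conditioning region: rho_{A|X=x} = rho_{X=x,A} / P(X=x).
preparations = [c / np.trace(c) for c in components]

povm_sums_to_identity = np.allclose(sum(povm), np.eye(2))
povm_is_positive = all(np.linalg.eigvalsh(e).min() > -1e-10 for e in povm)
preparations_normalized = all(np.isclose(np.trace(r), 1.0) for r in preparations)
```

The components of the conditional state of X given A sum to the identity and are positive, as required of a POVM, while the components of the conditional state of A given X are unit-trace density operators.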



D. Bayesian conditioning 

Classically, if you are interested in a random variable 
R, and you learn that a correlated variable X takes the 
value x, then you should update your probability dis- 
tribution for R from the prior, P(R), to the posterior, 
$P(R|X = x)$. This is known as Bayesian conditioning.

In the conditional states formalism, whenever there is 
a hybrid region, regardless of the causal relationship between
the classical variable X and the quantum region
A, you can always assign a joint state $\rho_{XA}$. When you
learn that X takes the value x, the state of the quantum
region should be updated from $\rho_A$ to $\rho_{A|X=x}$. This is
quantum Bayesian conditioning. 
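
As a simple illustration, consider the BB84 example from the introduction, with X labeling which of the four states was prepared. The sketch below assumes a uniform prior over X and places X as the first tensor factor; the function name `condition_on` is ours.

```python
import numpy as np

# BB84 preparations |0>, |1>, |+>, |-> labeled by x = 0, 1, 2, 3.
kets = [np.array([1.0, 0.0]),
        np.array([0.0, 1.0]),
        np.array([1.0, 1.0]) / np.sqrt(2),
        np.array([1.0, -1.0]) / np.sqrt(2)]
p_x = np.full(4, 0.25)          # assumed uniform prior over the label X
d_x, d_a = 4, 2

# Hybrid joint state rho_XA = sum_x P(x) |x><x|_X (x) |psi_x><psi_x|_A.
rho_xa = np.zeros((d_x * d_a, d_x * d_a))
for x, (p, ket) in enumerate(zip(p_x, kets)):
    proj_x = np.zeros((d_x, d_x))
    proj_x[x, x] = 1.0
    rho_xa += p * np.kron(proj_x, np.outer(ket, ket))

def condition_on(rho_xa, x, d_a):
    """Return (P(X=x), rho_{A|X=x}) from a hybrid state with X as first factor."""
    block = rho_xa[x * d_a:(x + 1) * d_a, x * d_a:(x + 1) * d_a]  # rho_{X=x,A}
    prob = np.trace(block).real
    return prob, block / prob

# Prior on A: the marginal obtained by tracing out X (summing diagonal blocks).
prior = sum(rho_xa[x * d_a:(x + 1) * d_a, x * d_a:(x + 1) * d_a]
            for x in range(d_x))

# Posterior after learning X = 2, i.e. that |+> was prepared.
prob_plus, posterior = condition_on(rho_xa, 2, d_a)
purity = np.trace(posterior @ posterior).real
```

An agent who does not know X assigns the maximally mixed prior, while an agent who learns X = 2 assigns the pure state |+⟩⟨+|; the supports of the two assignments intersect, so they satisfy the BFM criterion.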



E. How to read this paper 

This article is mainly concerned with the consequences 
of conditioning a quantum region on classical data, so the 
main objects of interest are hybrid conditional states with 
classical conditioning regions. In this case the set of oper- 
ators under consideration does not depend on the causal 
relation between the two regions. However, thus far we 
have only considered conditioning a quantum region on a 
single classical variable. Suppose instead that you learn 
the values of two classical variables, $X_1$ and $X_2$, and you
want to update your beliefs about a quantum region A.
In this case, there are some causal scenarios where your
beliefs cannot be correctly represented by a joint state
$\rho_{AX_1X_2}$. In such scenarios, our results do not apply.

To properly explain the distinction between the types 
of causal scenario to which our results apply and those to 
which they do not requires delving into the conditional 
states formalism in more detail. However, this extra ma- 
terial is not necessary for understanding most of our re- 
sults, so the reader who is eager to get to the discussion of 
compatibility, improvement and pooling can skip ahead 
to §IV, referring back to §III as necessary.

The next section covers the required background for 
understanding the scope of our results and gives several 
examples of experimental scenarios to which our results 
apply. In particular, all of the causal scenarios that have 
been considered to date in the literature on compatibility, 
improvement, and pooling are within the scope of our 
results. Indeed, given that all previous results have been 
derived in the context of specific causal scenarios, our 
results represent a substantial increase in the breadth of 
applicability, even if they do not yet cover all conceivable 
cases. 



III. MODELING EXPERIMENTAL SCENARIOS 
USING THE CONDITIONAL STATES 
FORMALISM 

Table II translates various concepts and formulas from
the conventional quantum formalism into the language of
conditional states. These correspondences are described
in more detail in [1]. The meaning of most of the rows
should be evident from the discussion in the previous sec- 
tion, and the rest are explained in this section as needed. 

We begin by showing how conditioning a quantum region
on a single classical variable works in several different
experimental scenarios. This is necessary background
knowledge for considering the more relevant scenarios
involving conditioning on a pair of variables. The different
experiments correspond to different causal structures,
which are illustrated by directed acyclic graphs.

FIG. 1: Quantum-classical hybrid regions with different
causal relations. Triangles represent classical variables (as
suggested by the shape of the probability simplex) and circles
represent quantum regions (as suggested by the spherical
state space of a qubit). (a) Preparation procedure: a quantum
region B is prepared in one of a set of states depending
on the value of a classical variable X (B is in the causal future
of X). (b) Remote measurement: a measurement is made on
A, which is acausally related to B. The classical outcome
X is then acausally related to B. (c) Measurement: a measurement
is made on a quantum region B and the classical
variable X represents the outcome (X is in the causal future
of B).



A. Conditioning on a single classical variable 

In this section, the quantum region we are interested 
in making inferences about is always denoted B and the 
classical variable on which the inference is based is de- 
noted X. 

Example III.1. Consider the following preparation procedure.
A classical random variable X with probability
distribution P(X) is generated by flipping coins,
rolling dice or any other suitable procedure, and then
a quantum region is prepared in a state $\rho^B_x$ depending
on the value of X obtained. This scenario is depicted
in fig. 1(a). Suppose that, initially, you do not
know the value of X that was obtained in this procedure.
In the conditional states formalism, your beliefs
about X are represented by a diagonal state $\rho_X$ with
components $\rho_{X=x} = P(X = x)$. The set of states prepared
is represented by a conditional state $\rho_{B|X}$ with
components $\rho_{B|X=x} = \rho^B_x$. Since the $\star$-product reduces
to a regular product for classical states, the joint state
of XB is $\rho_{XB} = \rho_{B|X}\rho_X$. In terms of components, this
is $\rho_{XB} = \sum_x P(X = x) |x\rangle\langle x|_X \otimes \rho^B_x$. It follows that
$\rho_{XB}$ contains sufficient information to describe an ensemble
of states, i.e. a set of states supplemented with a
probability distribution over them. Tracing over X gives
the marginal $\rho_B = \mathrm{Tr}_X(\rho_{B|X}\rho_X) = \sum_x P(X = x)\rho^B_x$,
which is easily recognized as the ensemble average state
on B.

Concept | Conventional Notation | Conditional States Formalism
------- | --------------------- | ----------------------------
Probability distribution of X | $P(X)$ | $\rho_X$
Probability that X = x | $P(X = x)$ | $\rho_{X=x}$
Set of states on A | $\{\rho^A_x\}$ | $\rho_{A|X}$
Individual state on A | $\rho^A_x$ | $\rho_{A|X=x}$
POVM on A | $\{E^A_x\}$ | $\rho_{X|A}$
Individual effect on A | $E^A_x$ | $\rho_{X=x|A}$
Channel from A to B | $\mathcal{E}_{B|A}$ | $\varrho_{B|A}$
Instrument | $\{\mathcal{E}^{B|A}_x\}$ | $\varrho_{XB|A}$
Individual operation | $\mathcal{E}^{B|A}_x$ | $\varrho_{X=x,B|A}$
The Born rule | $\forall x: P(X = x) = \mathrm{Tr}_A(E^A_x \rho_A)$ | $\rho_X = \mathrm{Tr}_A(\rho_{X|A}\rho_A)$
Ensemble averaging | $\rho_A = \sum_x P(X = x)\rho^A_x$ | $\rho_A = \mathrm{Tr}_X(\rho_{A|X}\rho_X)$
Action of a quantum channel | $\rho_B = \mathcal{E}_{B|A}(\rho_A)$ | $\rho_B = \mathrm{Tr}_A(\varrho_{B|A}\rho_A)$
Composition of channels | $\mathcal{E}_{C|A} = \mathcal{E}_{C|B} \circ \mathcal{E}_{B|A}$ | $\varrho_{C|A} = \mathrm{Tr}_B(\varrho_{C|B}\varrho_{B|A})$
State update rule | $\forall x: P(X = x)\rho^B_x = \mathcal{E}^{B|A}_x(\rho_A)$ | $\rho_{XB} = \mathrm{Tr}_A(\varrho_{XB|A}\rho_A)$

TABLE II: Translation of concepts and equations from conventional notation to the conditional states formalism.

According to the conventional formalism, upon learning
that X takes the value x, you should assign the state
that was prepared for that particular value of X to B,
which is just $\rho^B_x$. However, since $\rho_{B|X=x} = \rho^B_x$ in the
conditional states formalism, this update has the form
$\rho_B \to \rho_{B|X=x}$, so it is an example of quantum Bayesian
conditioning. The interpretation of conditioning in this
scenario is as an update from the ensemble average state
to a particular state in the ensemble.
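The preparation scenario of Example III.1 can be sketched numerically. The following NumPy example builds the hybrid joint state $\rho_{XB}$ for a two-state ensemble, recovers the ensemble average by tracing out X, and conditions on X = 1. The particular ensemble (a fair coin selecting $|0\rangle$ or $|+\rangle$) and all function names are assumptions made for illustration; they do not come from the paper.

```python
import numpy as np

# Hypothetical ensemble: X is a fair coin, and B is prepared in |0><0| or |+><+|.
p_x = np.array([0.5, 0.5])                    # components rho_{X=x} = P(X = x)
ket0 = np.array([1.0, 0.0])
ketp = np.array([1.0, 1.0]) / np.sqrt(2)
states = [np.outer(ket0, ket0), np.outer(ketp, ketp)]   # rho_{B|X=x}

def basis_projector(x, dim):
    """Projector |x><x| on the classical register X."""
    proj = np.zeros((dim, dim))
    proj[x, x] = 1.0
    return proj

# Joint state rho_XB = sum_x P(X = x) |x><x|_X (tensor) rho_{B|X=x}.
rho_xb = sum(p * np.kron(basis_projector(x, 2), s)
             for x, (p, s) in enumerate(zip(p_x, states)))

def trace_out_x(rho, dim_x, dim_b):
    """Partial trace over the first (classical) tensor factor X."""
    return np.trace(rho.reshape(dim_x, dim_b, dim_x, dim_b), axis1=0, axis2=2)

def condition_on_x(rho, x, dim_x, dim_b):
    """rho_{B|X=x}: the X = x diagonal block of rho_XB, renormalized."""
    block = rho.reshape(dim_x, dim_b, dim_x, dim_b)[x, :, x, :]
    return block / np.trace(block)

rho_b = trace_out_x(rho_xb, 2, 2)                 # ensemble average (prior) state
rho_b_given_1 = condition_on_x(rho_xb, 1, 2, 2)   # posterior after learning X = 1
```

Because X is classical, $\rho_{XB}$ is block diagonal, so conditioning amounts to selecting one block and renormalizing it.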

Example III.2. Suppose that A and B are two acausally
related quantum regions to which you assign the state
$\rho_{AB}$. The (prior) reduced state on B is $\rho_B = \mathrm{Tr}_A(\rho_{AB})$.
Now suppose that you make a measurement on A with
outcome described by the variable X and that the measurement
is associated with a POVM $\{E^A_x\}$. In the conditional
states formalism, the measurement is represented
by a conditional state $\rho_{X|A}$, where $\rho_{X=x|A} = E^A_x$. We
are interested in how the state for B gets updated upon
learning the outcome x of X. This causal scenario is depicted
in fig. 1(b). This is the scenario that occurs in the
EPR experiment, or more generally in "quantum steering".
The update map in this case is sometimes called a
"remote collapse rule".

In the conditional states formalism, the joint state on
XB can be determined by belief propagation from A to
X, i.e. $\rho_{XB} = \mathrm{Tr}_A(\rho_{X|A}\rho_{AB})$. The marginal on X
gives the outcome probabilities for the measurement and
is given by $\rho_X = \mathrm{Tr}_B(\rho_{XB})$. From these, the conditional
state $\rho_{B|X}$ is determined via $\rho_{B|X} = \rho_{XB}\rho_X^{-1}$. By substituting
X = x into the expression for $\rho_{B|X}$, we obtain
$\rho_{B|X=x}$. This is the state that you should assign to B
when you learn that X = x, i.e. the update rule for the remote
region is just Bayesian conditioning $\rho_B \to \rho_{B|X=x}$.
The updated state $\rho_{B|X=x}$ can be expressed in terms of
the givens in the problem, i.e. the state $\rho_{AB}$ and the
POVM elements $E^A_x$, but this is not especially instructive
for present purposes. Interested readers can consult
[1], where it is shown that this form of Bayesian conditioning
is precisely the same as the usual remote collapse
rule in the conventional formalism.
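Since the text states that Bayesian conditioning in this scenario agrees with the usual remote collapse rule, the update can be sketched with the conventional formula $P(X = x)\,\rho_{B|X=x} = \mathrm{Tr}_A[(E^A_x \otimes I_B)\rho_{AB}]$. The NumPy example below uses a singlet state and a computational-basis measurement on A; both choices, and the function name `remote_update`, are assumptions made for illustration.

```python
import numpy as np

# Singlet state of qubits A and B (a hypothetical choice of rho_AB).
psi = np.array([0.0, 1.0, -1.0, 0.0]) / np.sqrt(2)   # (|01> - |10>)/sqrt(2)
rho_ab = np.outer(psi, psi.conj())

# POVM on A: projective measurement in the computational basis.
povm = [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]

def remote_update(rho_ab, effect, dim_a=2, dim_b=2):
    """Return (P(X = x), rho_{B|X=x}) for the effect E_x acting on A."""
    # Unnormalized block Tr_A[(E_x (tensor) I_B) rho_AB] of the joint state rho_XB.
    weighted = np.kron(effect, np.eye(dim_b)) @ rho_ab
    block = np.trace(weighted.reshape(dim_a, dim_b, dim_a, dim_b),
                     axis1=0, axis2=2)
    prob = np.trace(block).real
    return prob, block / prob

# Prior state rho_B = Tr_A(rho_AB), and the posterior for each outcome.
prior_b = np.trace(rho_ab.reshape(2, 2, 2, 2), axis1=0, axis2=2)
results = [remote_update(rho_ab, e) for e in povm]
```

Averaging the posteriors with their outcome probabilities returns the prior $\rho_B$, which is the consistency property expected of conditioning.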

Example III.3. Consider the case where X represents the
outcome of a direct measurement made on B and you
want to condition the state of B on the value of this outcome.
This causal scenario is depicted in fig. 1(c) and is
described by an input state $\rho_B$ and a conditional state
$\rho_{X|B}$ with components given by the POVM that is being
measured. The conditional $\rho_{B|X=x}$ is then the X = x
component of $\rho_{B|X}$, which can be computed from an
application of Bayes' theorem $\rho_{B|X} = \rho_{X|B} \star (\rho_B\rho_X^{-1})$,
where $\rho_X = \mathrm{Tr}_B(\rho_{X|B}\rho_B)$. The operator $\rho_{B|X=x}$ is the
state that should be assigned to region B upon learning
that the outcome X takes the value x.

Note that Bayesian conditioning in this case is a kind
of retrodiction: the region being conditioned upon, the
outcome of the measurement, is to the future of the conditioned
region, the quantum input to the measurement.
This application of Bayesian conditioning to retrodiction
is discussed in detail in [1] and is shown to generate precisely
the same operational consequences as would be obtained
in the conventional formalism for retrodiction.

FIG. 2: Causal scenario for describing measurement update
rules, or quantum instruments. A represents the system before
the measurement, X is the measurement outcome and B
is the system after the measurement has been completed.

Example III. 4. Finally, consider a direct measurement 
again, but where the region of interest is the quantum 
output of the measurement rather than its input. Let A 
and B denote the input and output respectively. Since 
these are distinct regions, they must be given distinct 
labels in the conditional states formalism, whereas con- 
ventionally they would be given the same label as they 
represent the same system at two different times. The 
classical variable representing the outcome is X. We are 
interested in how the state of B should be updated upon 
learning the value of X. The relevant causal structure 
is depicted in fig. 2. The causal arrow from X to B
represents the fact that the post-measurement state can 
depend on the measurement outcome in addition to the 
pre-measurement state. 

In general, the rule for determining the state of the region
after the measurement, given the state of the region
before the measurement and the outcome, is not uniquely
determined by the POVM associated with the measurement.
The most general possible rule is conventionally
represented by a quantum instrument, which is a set of
trace-nonincreasing completely positive maps, $\{\mathcal{E}^{B|A}_x\}$.
The operation $\mathcal{E}^{B|A}_x$ maps a pre-measurement state $\rho_A$
to the unnormalized post-measurement state that should
be assigned when the outcome is x, i.e. $\mathcal{E}^{B|A}_x(\rho_A) =
P(X = x)\rho^B_x$, where P(X = x) is the probability of
obtaining outcome x and $\rho^B_x$ is the normalized post-measurement
state. This implies that if a measurement
is associated with a POVM $\{E^A_x\}$, then the quantum instrument
must satisfy $\mathrm{Tr}_B(\mathcal{E}^{B|A}_x(\rho_A)) = \mathrm{Tr}_A(E^A_x\rho_A)$
for all input states $\rho_A$.

It is not too difficult to see how to represent a quantum
instrument in the conditional states formalism. First,
note that the measurement generates an ensemble of
states for B, i.e. for each possible outcome X = x there
is a probability P(X = x), given by the Born rule, and
a corresponding state $\rho^B_x$ for B, which is the state that
should be assigned to B when the outcome X = x occurs.
We have already seen that an ensemble of states
can be written as a joint state $\rho_{XB}$ of the hybrid region
XB via $\rho_{XB} = \sum_x P(X = x) |x\rangle\langle x|_X \otimes \rho^B_x$. What is
needed then, is a way of determining a joint state $\rho_{XB}$
of XB, given a state $\rho_A$ of region A. Perhaps unsurprisingly,
this can be done by specifying a causal conditional
state $\varrho_{XB|A}$ and using belief propagation to obtain
$\rho_{XB} = \mathrm{Tr}_A(\varrho_{XB|A}\rho_A)$. The POVM that is measured by
this procedure is given by the components of the conditional
state $\varrho_{X|A} = \mathrm{Tr}_B(\varrho_{XB|A})$. The precise relation
between the instrument $\{\mathcal{E}^{B|A}_x\}$ and the causal conditional
state $\varrho_{XB|A}$ is obtained through the Jamiołkowski
isomorphism and is described in

If you assign a prior state $\rho_A$ to the region before
the measurement, and describe the quantum instrument
implementing the measurement by $\varrho_{XB|A}$, then
the ensemble of output states is described by $\rho_{XB} =
\mathrm{Tr}_A(\varrho_{XB|A}\rho_A)$. The marginal state $\rho_B = \mathrm{Tr}_X(\rho_{XB})$ is
then your prior state for the output region and $\rho_X =
\mathrm{Tr}_B(\rho_{XB})$ gives the Born rule probabilities for the measurement
outcomes. The states in the ensemble, $\rho_{B|X=x}$,
can then be computed from the conditional $\rho_{B|X} =
\rho_{XB}\rho_X^{-1}$. Upon learning that X = x, you should update
your beliefs about B by Bayesian conditioning, i.e.
by the rule $\rho_B \to \rho_{B|X=x}$.

Note that Bayesian conditioning is not a rule that
maps your prior state about the measurement's input
to your posterior state about the measurement's output,
which would be a map of the form $\rho_A \to \rho_{B|X=x}$. The
projection postulate is an instance of this latter kind of
update, but it is not an instance of Bayesian conditioning.
Bayesian conditioning is a map from prior states
to posterior states of one and the same region. The map
$\rho_B \to \rho_{B|X=x}$, which takes the prior state of the measurement's
output to the posterior state of the measurement's
output, is an instance of quantum Bayesian conditioning.
In the conventional formalism it corresponds to a transi- 
tion from the output of a non-selective state-update rule, 
which you would apply when you know that a measure- 
ment has occurred but not which outcome was obtained, 
to the output of the corresponding selective state-update 
rule, which applies when you do know the outcome. 
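The distinction between the non-selective and selective update rules can be made concrete with a small NumPy sketch. It assumes, purely for illustration, a Lüders instrument $\mathcal{E}_x(\rho) = \Pi_x \rho \Pi_x$ for a computational-basis measurement of a qubit prepared in $|+\rangle$; neither this instrument nor the variable names are taken from the paper.

```python
import numpy as np

# Hypothetical instrument: a Lueders (projection-postulate) measurement of a
# qubit in the computational basis, E_x(rho) = Pi_x rho Pi_x.
projectors = [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]

ketp = np.array([1.0, 1.0]) / np.sqrt(2)
rho_a = np.outer(ketp, ketp)                  # prior state of the input region A

# Unnormalized outputs E_x(rho_A) = P(X = x) rho_{B|X=x}: the blocks of rho_XB.
blocks = [pi @ rho_a @ pi for pi in projectors]
probs = [np.trace(b).real for b in blocks]    # Born rule probabilities (rho_X)

# Non-selective update: the prior state rho_B of the output region.
rho_b = sum(blocks)

# Selective update: the posteriors rho_{B|X=x} reached by conditioning rho_B.
posteriors = [b / p for b, p in zip(blocks, probs)]
```

Here $\rho_B$ is maximally mixed and differs from the input state $\rho_A$, so conditioning takes $\rho_B$ (not $\rho_A$) to $\rho_{B|X=x}$, in line with the distinction drawn above.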



B. Conditioning on two classical variables 

The problems discussed in this paper concern infer- 
ences made by multiple (typically two) agents based on 
different data. Thus, we are interested in conditioning a 
quantum region on the values of more than one classi- 
cal variable, which may or may not be known to all the 
agents. 

It is convenient to introduce a few more notational
conventions to handle such scenarios. Since we are using
letters to denote regions, we use numbers to refer to
agents. Given that regions A and B are prominent in
our article, it is confusing to use the usual names Alice
and Bob for our numbered agents, so we refer to agent
1 as Wanda and agent 2 as Theo. Occasionally, we will
refer to a decision-maker, whom we call Debbie, and for
whom we use the number 0. A classical variable that
agent j learns during the course of their inference procedure
is denoted $X_j$. The quantum region about which
the agents are making inferences is denoted B, and, when
making analogies between quantum theory and classical
probability theory, the classical variable analogous to B
is denoted Y. Any other auxiliary quantum regions involved
in setting up the causal scenario are denoted A
(or $A_1, A_2, \ldots$ if there is more than one of them) and
auxiliary classical variables are denoted Z (or $Z_1, Z_2, \ldots$
if there is more than one of them).

Depending on the causal relations between the classical 
variables $X_j$ and an elementary quantum region B, it is
possible to construct scenarios in which the available in- 
formation about the quantum region cannot be summed 
up by the assignment of a single state (positive density 
operator) to the region. For example, this is familiar in 
the case of pre- and post-selected ensembles, which are 
described by a pair of states rather than a single state 
in the formalism of Aharonov et al. [45]. Although our
results apply to a much wider variety of causal scenarios 
than those typically discussed in the literature on com- 
patibility, improvement, and pooling, we still do not con- 
sider situations in which the region of interest has to be 
described by a more exotic object than a single quantum 
state. Of course, a general quantum theory of Bayesian 
inference should be able to address such scenarios, but 
that is a topic for future work. 

Mathematically speaking, our results apply whenever 
the following condition holds: 

Condition III.5. The joint region consisting of the
quantum region of interest, B, and all the classical variables
involved in the inference procedure, $X_1, X_2, \ldots$, can
be assigned a joint state $\rho_{BX_1X_2\ldots}$ (which may be either
an acausal or a causal state).

Consider the case of two classical variables, $X_1$ and $X_2$,
and suppose that a joint state $\rho_{BX_1X_2}$ exists. From this,
one can compute the reduced states $\rho_B$, $\rho_{X_1}$ and $\rho_{X_2}$, and
the joint states $\rho_{BX_1}$, $\rho_{BX_2}$ and $\rho_{X_1X_2}$. From these, one
can easily compute the conditional states $\rho_{B|X_1}$, $\rho_{B|X_2}$
and $\rho_{B|X_1X_2}$. If Wanda learns that $X_1 = x_1$ then she
updates $\rho_B$ to her posterior state $\rho_{B|X_1=x_1}$, and if Theo
learns that $X_2 = x_2$ then he updates $\rho_B$ to his posterior
state $\rho_{B|X_2=x_2}$. An agent who learns both outcomes
would update to $\rho_{B|X_1=x_1,X_2=x_2}$. The existence of the
joint state $\rho_{X_1X_2B}$ ensures that all the posterior states
$\rho_{B|X_1=x_1}$, $\rho_{B|X_2=x_2}$ and $\rho_{B|X_1=x_1,X_2=x_2}$ are well defined.
Similar comments apply when there are more than two
classical variables.

In the remainder of this section, we give several examples
of causal scenarios in which this condition does
apply, in order to emphasize the generality of our results,
and we provide some examples where it does not,
to clarify the limitations to their applicability. All the
examples involve inferences about a quantum region B
based on two classical variables $X_1$ and $X_2$.

FIG. 3: Introducing an extra classical variable to the causal
scenarios depicted in fig. 1 via post-processing.

1. Examples of causal scenarios in which a joint state can 
be assigned 

Example III.6. Perhaps the simplest class of causal scenarios
in which a joint state can be assigned are those
in which the second variable $X_2$ is obtained via a post-processing
of the variable $X_1$, i.e. $X_2$ is obtained from $X_1$
via conditional probabilities $P(X_2|X_1)$, or equivalently a
classical conditional state $\rho_{X_2|X_1}$. Only $X_1$ is directly
related to the quantum region B and any correlations
between $X_2$ and B are mediated by $X_1$. Examples of
this sort of causal scenario are depicted in fig. 3.

In all these scenarios, we already know from §III A that
$BX_1$ can be assigned a joint state $\rho_{BX_1}$ and then the joint
state of $BX_1X_2$ is just

$\rho_{BX_1X_2} = \rho_{X_2|X_1}\rho_{BX_1}$, (2)

so condition III.5 is satisfied. These examples are important
because they imply that arbitrary classical processing
may be performed on a classical variable without
changing our ability to assign a joint state. In particular,
this is used in §V B, where hybrid sufficient statistics are
defined as a kind of processing of a classical data variable.




FIG. 4: Wanda and Theo learn variables that are correlated 
with a variable used to prepare region B. 

Example III.7. Consider a generalization of the preparation
scenario depicted in fig. 1(a) to the scenario depicted
in fig. 4, which adds two further classical variables that
depend on the preparation variable. In this scenario, a
classical random variable Z is sampled from a probability
distribution P(Z) and, upon obtaining the outcome
Z = z, a region B is prepared in the state $\rho_{B|Z=z}$. Some
data about Z is revealed to the two agents: $X_1$ to Wanda,
and $X_2$ to Theo. $X_1$ and $X_2$ may be coarse-grainings of
Z, or they may even depend on Z stochastically. For
example, if Z is the outcome of a dice roll, then $X_1$ and
$X_2$ could both be binary variables, with $X_1$ indicating
whether Z is odd or even and $X_2$ indicating whether it
is < 3. Generally, the dependence of $X_1$ and $X_2$ on Z is
given by classical conditional states $\rho_{X_1|Z}$ and $\rho_{X_2|Z}$. A
joint state for $X_1X_2B$ can be defined in this case via

$\rho_{X_1X_2B} = \mathrm{Tr}_Z(\rho_{X_1|Z}\rho_{X_2|Z}\rho_{B|Z}\rho_Z)$, (3)

so, again, condition III.5 is satisfied.
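The classical bookkeeping behind eq. (3) can be sketched for the dice example. Each component of the joint state is a mixture of the prepared states $\rho_{B|Z=z}$ with weights $P(x_1|z)P(x_2|z)P(z)$, so a posterior state is fixed by the posterior weights over Z. The Python sketch below computes those weights, assuming a fair die, deterministic coarse-grainings, and the threshold Z < 3 exactly as it appears in the text; the function names are hypothetical.

```python
from fractions import Fraction as F
from itertools import product

p_z = {z: F(1, 6) for z in range(1, 7)}       # fair die, P(Z)

def x1_of(z):
    return z % 2                               # X1 = 1 if Z is odd, 0 if even

def x2_of(z):
    return int(z < 3)                          # X2 = 1 if Z < 3, else 0

# Classical part of eq. (3): the weight P(x1|z) P(x2|z) P(z) that each
# rho_{B|Z=z} receives in the (x1, x2) component of the joint state.
weights = {(x1, x2): {} for x1, x2 in product((0, 1), repeat=2)}
for z, p in p_z.items():
    weights[(x1_of(z), x2_of(z))][z] = p

def posterior_over_z(x1, x2):
    """P(Z | X1 = x1, X2 = x2); rho_{B|X1=x1,X2=x2} is the matching mixture."""
    w = weights[(x1, x2)]
    total = sum(w.values())                    # P(X1 = x1, X2 = x2)
    return {z: p / total for z, p in w.items()}

def posterior_given_x1(x1):
    """Wanda's weights P(Z | X1 = x1), obtained by summing over X2."""
    w = {z: p for x2 in (0, 1) for z, p in weights[(x1, x2)].items()}
    total = sum(w.values())
    return {z: p / total for z, p in w.items()}
```

For instance, `posterior_given_x1(1)` spreads Wanda's weight evenly over the odd outcomes, while `posterior_over_z(1, 0)` restricts it to the odd outcomes that are not below 3.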




FIG. 5: Wanda and Theo learn about B by making measurements
on two acausally related regions $A_1$ and $A_2$.

Example III.8. Consider the generalization of the remote
measurement scenario depicted in fig. 1(b) to a pair of remote
measurements, as depicted in fig. 5. This scenario
is in fact the one that is adopted in much of the literature
on compatibility and pooling [ljj HH HH . The region of 
interest, B, is acausally related to two other quantum
regions, $A_1$ and $A_2$, so we have a tripartite state $\rho_{A_1A_2B}$.
Direct measurements are made on $A_1$ and $A_2$, with outcomes
$X_1$ and $X_2$ respectively, which are described
by the conditional states $\rho_{X_1|A_1}$ and $\rho_{X_2|A_2}$ respectively.
It is assumed that Wanda learns only $X_1$ and Theo learns
only $X_2$. In this case, we can define a tripartite acausal
state by

$\rho_{X_1X_2B} = \mathrm{Tr}_{A_1A_2}(\rho_{X_1|A_1}\rho_{X_2|A_2}\rho_{A_1A_2B})$. (4)

FIG. 6: Wanda and Theo learn variables derived from a direct
measurement made on region B.

Example III.9. Consider a generalization of the direct
measurement scenario depicted in fig. 1(c) to the scenario
of fig. 6, which introduces two further classical variables
that depend on the measurement result. This is similar
to the second example considered in this section (Example III.7) except
that, rather than Z being used to prepare B, it is now obtained
by making a direct measurement on B, described
by the conditional state $\rho_{Z|B}$. As before, some information
about Z is distributed to each agent, specifically,
variables $X_1$ and $X_2$ to Wanda and Theo respectively.
The dependence of $X_1$ and $X_2$ on Z is again described
by conditional states $\rho_{X_1|Z}$ and $\rho_{X_2|Z}$. In this case, a
joint state $\rho_{X_1X_2B}$ can be defined as

$\rho_{X_1X_2B} = \mathrm{Tr}_Z(\rho_{X_1|Z}\rho_{X_2|Z}[\rho_{Z|B} \star \rho_B])$, (5)

and conditioning on values of the classical variables yields
states that are relevant for retrodiction.




FIG. 7: Wanda and Theo learn the results of two measurements
performed in sequence.

Example III.10. Consider a generalization of the measurement
scenario depicted in fig. 2 to a case where a
pair of measurements are implemented in succession, as
depicted in fig. 7. This scenario has been considered in
the context of compatibility and pooling by Jacobs [28],
as discussed in §IV C 2. The input region of the first measurement
is denoted $A_1$. The output of the first measurement,
which is also the input of the second, is denoted $A_2$,
and the output of the second measurement, which is the
region about which Wanda and Theo seek to make inferences,
is denoted by B. The classical variables describing
the outcomes of the two measurements are denoted $X_1$
and $X_2$ respectively, and it is assumed that Wanda learns
$X_1$ while Theo learns $X_2$.

Suppose that Wanda and Theo agree on the input state
$\rho_{A_1}$ and on the causal conditional states, $\varrho_{X_1A_2|A_1}$ and
$\varrho_{X_2B|A_2}$, that describe the measurements. A joint state
can then be assigned to $X_1X_2B$ via

$\rho_{X_1X_2B} = \mathrm{Tr}_{A_1A_2}(\varrho_{X_2B|A_2}\varrho_{X_1A_2|A_1}\rho_{A_1})$. (6)

The interpretation of eq. (6) is that the two consecutive
measurements can be thought of as a preparation procedure
for B that prepares the states in the ensemble
$\rho_{X_1X_2B}$ depending on the values of $X_1$ and $X_2$.

These examples should serve to give an idea of the type 
of scenarios to which our results apply. 

2. Examples of causal scenarios in which a joint state 
cannot be assigned 




FIG. 8: Wanda learns a preparation variable and Theo learns
a measurement variable. Learning both variables gives a pre-
and post-selected ensemble.

Example III.11. Consider the prepare-and-measure scenario
depicted in fig. 8. Here, B is prepared in a state
depending on the preparation variable $X_1$ and then B
is measured, resulting in the outcome $X_2$. More concretely,
consider the case where B is a qubit prepared in
the $\{|0\rangle_B, |1\rangle_B\}$ basis and measured in the $\{|+\rangle_B, |-\rangle_B\}$
basis, where $|\pm\rangle = \frac{1}{\sqrt{2}}(|0\rangle \pm |1\rangle)$. Suppose that $X_2$ takes
the value $X_2 = 0$ for $|+\rangle_B$ and $X_2 = 1$ for $|-\rangle_B$. Although
it is possible to assign joint states to $X_1B$ and to
$X_2B$, the conditional states that these assignments imply
are not compatible with any joint state for $X_1X_2B$.

To see this, note that the joint states for $X_1B$ and $X_2B$
have to be of the form

$\rho_{X_1B} = P(X_1 = 0) |0\rangle\langle 0|_{X_1} \otimes |0\rangle\langle 0|_B + P(X_1 = 1) |1\rangle\langle 1|_{X_1} \otimes |1\rangle\langle 1|_B$ (7)

$\rho_{X_2B} = P(X_2 = 0) |0\rangle\langle 0|_{X_2} \otimes |+\rangle\langle +|_B + P(X_2 = 1) |1\rangle\langle 1|_{X_2} \otimes |-\rangle\langle -|_B$, (8)

where $P(X_1)$ is the distribution of the preparation variable
and $P(X_2)$ is the Born rule probability distribution
for the outcomes of the measurement.



Then, $\rho_{B|X_1=x_1}$ is a definite state in the $\{|0\rangle_B, |1\rangle_B\}$
basis and $\rho_{B|X_2=x_2}$ is a definite state in the $\{|+\rangle_B, |-\rangle_B\}$
basis. Any putative $\rho_{B|X_1=x_1,X_2=x_2}$, derived from a joint
state of all three regions, would then have to have definite
values for measurements in both the $\{|0\rangle_B, |1\rangle_B\}$ basis
and in the $\{|+\rangle_B, |-\rangle_B\}$ basis. There is no state with this
property because these are complementary observables.

Conditioning on both $X_1 = x_1$ and $X_2 = x_2$ represents
a case of pre- and post-selection, and, as argued by
Aharonov et al. [45], the concept of a quantum state has
to be generalized in order to handle such cases.
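The complementarity invoked in Example III.11 can be checked directly: a qubit state that is definite in one of the two bases yields uniformly random outcomes in the other. The following NumPy sketch is only an illustration of that fact, with hypothetical helper names, and is not a substitute for the argument in the text.

```python
import numpy as np

ket0, ket1 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
ketp, ketm = (ket0 + ket1) / np.sqrt(2), (ket0 - ket1) / np.sqrt(2)

def outcome_probs(state, basis):
    """Born rule probabilities |<b|state>|^2 for each basis vector b."""
    return [abs(np.vdot(b, state)) ** 2 for b in basis]

# States definite in the {|0>, |1>} basis, measured in the {|+>, |->} basis.
z_in_x = [outcome_probs(s, [ketp, ketm]) for s in (ket0, ket1)]

# States definite in the {|+>, |->} basis, measured in the {|0>, |1>} basis.
x_in_z = [outcome_probs(s, [ket0, ket1]) for s in (ketp, ketm)]
```

Every probability in `z_in_x` and `x_in_z` equals one half, so no single density operator can assign a definite value in both bases.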




FIG. 9: Learning both the outcome of a direct measurement 
and the outcome of a remote measurement. 



Example III.12. Consider two acausally related quantum
regions A and B. Here, B is the region of interest, but direct
measurements are made on both A and B, resulting
in the classical variables $X_1$ and $X_2$ respectively. This is
depicted in fig. 9. Formally, this is very similar to pre-
and post-selection and a joint state of $X_1X_2B$ is ruled
out for similar reasons.

Suppose that A and B are qubits and that $\rho_{AB} =
|\Psi^-\rangle\langle\Psi^-|_{AB}$ is a singlet state, where $|\Psi^-\rangle_{AB} =
\frac{1}{\sqrt{2}}(|01\rangle - |10\rangle)_{AB}$. If $X_1$ is the result of a measurement
of A in the $\{|0\rangle_A, |1\rangle_A\}$ basis and $X_2$ is the result of measuring
B in the $\{|+\rangle_B, |-\rangle_B\}$ basis then, as before, the
state $\rho_{B|X_1=x_1}$ would have to be a definite state in the
$\{|0\rangle_B, |1\rangle_B\}$ basis and $\rho_{B|X_2=x_2}$ would have to be a definite
state in the $\{|+\rangle_B, |-\rangle_B\}$ basis. The putative joint
state would then have to have a conditional with components
$\rho_{B|X_1=x_1,X_2=x_2}$ that are definite in both bases,
which is not possible in the formalism as it currently
stands.



IV. COMPATIBILITY OF QUANTUM STATES 

This section describes our Bayesian approach to the 
compatibility of quantum states. We give alternative 
derivations of the BFM compatibility criterion from the 
point of view of objective and subjective Bayesianism. In 
each case, we begin by reviewing the corresponding ar- 
gument in the classical case in order to build intuition, 
and draw out the parallels to the quantum case using the 
conditional states formalism. 
A. Objective Bayesian compatibility

First consider compatibility for a classical random variable
Y. For the objective Bayesian, the only way that two
agents' probability assignments can differ is if they have
had access to different data, so suppose Wanda learns
the value of a random variable $X_1$ and Theo learns the
value of a different random variable $X_2$. According to
the objective Bayesian, there is a unique prior probability
distribution $P(Y, X_1, X_2)$ that both Wanda and
Theo ought to initially assign to the three variables before
they have observed the values of the $X_j$'s. Both Wanda
and Theo's prior distribution for Y alone is simply the
marginal $P(Y) = \sum_{X_1,X_2} P(Y, X_1, X_2)$. Upon observing
a particular value $x_j$ of $X_j$, Wanda and Theo update to
their posterior distributions $P(Y|X_j = x_j)$.

Now suppose that we don't know the details of how 
Wanda and Theo arrived at their probability assignments 
and we are simply told that, at some specific point in 
time, Wanda assigns some distribution $Q_1(Y)$ to Y and
Theo assigns a distribution $Q_2(Y)$ (different from $Q_1(Y)$
in general). For the objective Bayesian, this can only
arise in the manner described above, so the notion of 
compatibility is defined as follows. 

Definition IV.1 (Classical objective Bayesian compatibility).
Two probability distributions $Q_1(Y)$ and $Q_2(Y)$
are compatible if it is possible to construct a pair of
random variables, $X_1$ and $X_2$, and a joint distribution
$P(Y, X_1, X_2)$ such that $Q_1(Y)$ can be obtained by
Bayesian conditioning on $X_1 = x_1$ for some value $x_1$,
and $Q_2(Y)$ can be obtained by Bayesian conditioning on
$X_2 = x_2$ for some value $x_2$, that is,

$Q_j(Y) = P(Y|X_j = x_j)$ (9)

for some values $x_j$ of $X_j$. Further, we require that
$P(X_1 = x_1, X_2 = x_2) \neq 0$ so that there is a possibility
for both outcomes to be obtained simultaneously.

This definition of compatibility is equivalent to the requirement
that the supports of $Q_1(Y)$ and $Q_2(Y)$ have
nontrivial intersection, where the support of a probability
distribution P(Y) is defined as $\mathrm{supp}[P(Y)] =
\{y \,|\, P(Y = y) > 0\}$.

Theorem IV.2. Two distributions $Q_1(Y)$ and $Q_2(Y)$
satisfy definition IV.1, i.e. they are compatible in the objective
Bayesian sense, iff they share some common support,
i.e.

$\mathrm{supp}[Q_1(Y)] \cap \mathrm{supp}[Q_2(Y)] \neq \emptyset$. (10)

The proof makes use of the following lemma. 

Lemma IV.3. If a probability distribution P(X, Y) satisfies
$P(X = x) \neq 0$ then $\mathrm{supp}[P(Y|X = x)] \subseteq
\mathrm{supp}[P(Y)]$.

Proof. The condition $P(X = x) \neq 0$ implies that
$P(Y = y|X = x)$ is well defined for all y. Let $\ker[P(Y)] =
\{y \,|\, P(Y = y) = 0\}$, i.e. $\ker[P(Y)]$ is the complement of
$\mathrm{supp}[P(Y)]$. Let $y \in \ker[P(Y)]$. Since $P(Y = y) = 0$,
we have $\sum_{x'} P(Y = y, X = x') = 0$, which implies that
$P(Y = y, X = x') = 0$ for every value $x'$ and consequently
that $P(Y = y|X = x) = 0$. In other words,
$y \in \ker[P(Y)]$ implies $y \in \ker[P(Y|X = x)]$, which
means that $\ker[P(Y)] \subseteq \ker[P(Y|X = x)]$, or equivalently
$\mathrm{supp}[P(Y|X = x)] \subseteq \mathrm{supp}[P(Y)]$. □

Proof of theorem IV.2.
The "only if" half:

It is given that there is a joint distribution $P(Y, X_1, X_2)$
such that $Q_j(Y) = P(Y|X_j = x_j)$. Since $P(X_1 =
x_1, X_2 = x_2) \neq 0$, $P(Y|X_1 = x_1, X_2 = x_2)$ exists and, by
lemma IV.3, it must satisfy

$\mathrm{supp}[P(Y|X_1 = x_1, X_2 = x_2)] \subseteq \mathrm{supp}[P(Y|X_1 = x_1)]$ (11)

$\mathrm{supp}[P(Y|X_1 = x_1, X_2 = x_2)] \subseteq \mathrm{supp}[P(Y|X_2 = x_2)]$. (12)

Since every probability distribution has nontrivial support,
$\mathrm{supp}[P(Y|X_1 = x_1, X_2 = x_2)] \neq \emptyset$, so

$\mathrm{supp}[P(Y|X_1 = x_1)] \cap \mathrm{supp}[P(Y|X_2 = x_2)] \neq \emptyset$. (13)

The "if" half:

Given that $Q_1(Y)$ and $Q_2(Y)$ have intersecting support,
one can find a normalized probability distribution $Q_0(Y)$
such that

$Q_1(Y) = p_1 Q_0(Y) + (1 - p_1) Q'_1(Y)$, (14)
$Q_2(Y) = p_2 Q_0(Y) + (1 - p_2) Q'_2(Y)$, (15)

where $Q'_1(Y)$ and $Q'_2(Y)$ are each normalized probability
distributions and $0 < p_1, p_2 < 1$.

This decomposition can be used to construct two random
variables, $X_1$ and $X_2$, and a joint distribution
$P(Y, X_1, X_2)$ such that $P(X_1 = x_1, X_2 = x_2) \neq 0$ and
$Q_j(Y) = P(Y|X_j = x_j)$ for some values $x_1$ and $x_2$. Let
$X_1$ and $X_2$ be bit-valued variables that take values {0, 1},
and define

$P(Y|X_1 = 0, X_2 = 0) = Q_0(Y)$ (16)
$P(Y|X_1 = 0, X_2 = 1) = Q'_1(Y)$ (17)
$P(Y|X_1 = 1, X_2 = 0) = Q'_2(Y)$. (18)

The result of conditioning on $(X_1 = 1, X_2 = 1)$ can be
taken to be an arbitrary distribution, denoted by N(Y),
i.e.

$P(Y|X_1 = 1, X_2 = 1) = N(Y)$. (19)

Next, define the following distribution over $X_1$ and $X_2$:

$P(X_1 = 0, X_2 = 0) = p_1 p_2$
$P(X_1 = 0, X_2 = 1) = (1 - p_1) p_2$
$P(X_1 = 1, X_2 = 0) = p_1 (1 - p_2)$
$P(X_1 = 1, X_2 = 1) = (1 - p_1)(1 - p_2)$.

Using these, we can define $P(Y, X_1, X_2) =
P(Y|X_1, X_2)P(X_1, X_2)$. It is straightforward to
verify that this satisfies $P(X_1 = 0, X_2 = 0) = p_1 p_2 > 0$
and that $P(Y|X_1 = 0)$ and $P(Y|X_2 = 0)$ are equal to
the right-hand sides of eqs. (14) and (15). Consequently,
they are equal to $Q_1(Y)$ and $Q_2(Y)$ respectively. □
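The construction in the "if" half of the proof can be checked on a small example. The Python sketch below builds the joint distribution from eqs. (14)-(19) and the distribution over $X_1$ and $X_2$ given above, then confirms that conditioning on $X_1 = 0$ and $X_2 = 0$ returns the two input distributions. The choice of $Q_0$ as the normalized pointwise minimum, the choice $p_1 = p_2 = m/2$, the use of $Q_0$ for the arbitrary $N(Y)$, and all function names are assumptions of this sketch; the proof only requires that some valid decomposition exists.

```python
from fractions import Fraction as F

def decompose(q1, q2):
    """Return Q0, p1, p2, Q'_1, Q'_2 with Q_j = p_j Q0 + (1 - p_j) Q'_j."""
    ys = sorted(set(q1) | set(q2))
    overlap = {y: min(q1.get(y, 0), q2.get(y, 0)) for y in ys}
    m = sum(overlap.values())
    if m == 0:
        raise ValueError("supports do not intersect")
    q0 = {y: overlap[y] / m for y in ys}       # supported on the intersection
    p = m / 2                                  # any 0 < p <= m with p < 1 works

    def remainder(q):
        # Q'_j = (Q_j - p Q0) / (1 - p), non-negative because p Q0 <= Q_j.
        return {y: (q.get(y, 0) - p * q0[y]) / (1 - p) for y in ys}

    return q0, p, p, remainder(q1), remainder(q2)

def build_joint(q1, q2):
    """Joint P(Y, X1, X2) = P(Y | X1, X2) P(X1, X2) from the proof."""
    q0, p1, p2, q1p, q2p = decompose(q1, q2)
    cond = {(0, 0): q0, (0, 1): q1p, (1, 0): q2p, (1, 1): q0}  # N(Y) := Q0
    p_x = {(0, 0): p1 * p2, (0, 1): (1 - p1) * p2,
           (1, 0): p1 * (1 - p2), (1, 1): (1 - p1) * (1 - p2)}
    return {(y, x1, x2): cond[(x1, x2)][y] * p_x[(x1, x2)]
            for (x1, x2) in cond for y in q0}

def posterior(joint, index, value):
    """P(Y | X_index = value); keys of `joint` are (y, x1, x2)."""
    post = {}
    for key, p in joint.items():
        if key[index] == value:
            post[key[0]] = post.get(key[0], 0) + p
    total = sum(post.values())
    return {y: p / total for y, p in post.items()}

# Hypothetical distributions whose supports intersect only at "b".
q1 = {"a": F(1, 2), "b": F(1, 2), "c": F(0)}
q2 = {"a": F(0), "b": F(1, 4), "c": F(3, 4)}
joint = build_joint(q1, q2)
wanda = posterior(joint, 1, 0)    # P(Y | X1 = 0), expected to equal Q1
theo = posterior(joint, 2, 0)     # P(Y | X2 = 0), expected to equal Q2
```

Exact fractions make the comparison with $Q_1$ and $Q_2$ an equality check rather than a numerical tolerance test.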



The "only if" part of the proof of theorem IV.2 establishes
that intersecting supports is a necessary requirement
for objective Bayesian state assignments. On the
other hand, the "if" part only establishes sufficiency for
causal scenarios that support generic joint states. For a
given causal scenario, i.e. a given set of causal relations
holding among Y, $X_1$ and $X_2$, there may be restrictions
on the joint probability distributions that can arise. As
an extreme example, if the causal structure is such that
the composite variable $YX_1$ and the elementary variable
$X_2$ are neither connected by some direct or indirect
causal influence, nor connected by a common cause, then
they will be statistically independent and the joint distribution
will factorize as $P(Y, X_1, X_2) = P(Y, X_1)P(X_2)$.
Under such a restriction, there are certain pairs of states
$Q_1(Y)$ and $Q_2(Y)$ that have intersecting support, but
Wanda and Theo could never come to assign them by
conditioning on $X_1$ and $X_2$. For instance, in the example
just mentioned, $Q_2(Y)$ must be equal to the prior
over Y and consequently, by lemma IV.3, the only pairs
$Q_1(Y)$ and $Q_2(Y)$ that can arise by such conditioning
are pairs for which the support of $Q_1(Y)$ is contained in
that of $Q_2(Y)$. Therefore, not every pair of compatible
state assignments will arise in a given causal scenario. On
the other hand, in "generic" scenarios wherein the causal
structure does not force any conditional independences in
the joint distribution over Y, $X_1$ and $X_2$, the "if" part
of the proof does establish that any pair of states with
intersecting support can arise as the state assignments of
a pair of objective Bayesian agents.

Turning now to the quantum case, consider a quantum region $B$
with Hilbert space $\mathcal{H}_B$. For the objective Bayesian
the only way that two agents' state assignments to $B$ can
differ is if they have access to different data. We represent
this data by two random variables $X_1$ and $X_2$, where Wanda
has access to $X_1$ and Theo has access to $X_2$. Assume that the
causal scenario of the experiment can be described by a joint
state on the hybrid region $BX_1X_2$, as discussed in §III B.

Given that this is an objective Bayesian approach, before Wanda
and Theo observe the values of the $X_j$'s, there is a unique
prior state $\rho_{BX_1X_2}$ which they should both assign, the
prior state for $B$ alone being
$\rho_B = \mathrm{Tr}_{X_1X_2}(\rho_{BX_1X_2})$. After Wanda and
Theo observe the values $x_j$ of $X_j$ they update their states
for $B$ to the posteriors $\rho_{B|X_j=x_j}$.

Now suppose that we don't know the details of how Wanda and
Theo arrived at their state assignments and we are simply told
that, at some specific point in time, Wanda assigns a state
$\sigma^{(1)}_B$ to $B$ and Theo assigns a state $\sigma^{(2)}_B$
(different from $\sigma^{(1)}_B$ in general). For the objective
Bayesian, this can only arise in the manner described above, so
the condition for compatibility is that it should be possible to
construct a hybrid state $\rho_{BX_1X_2}$ over $B$ and two
classical random variables $X_1$ and $X_2$ such that

$\sigma^{(j)}_B = \rho_{B|X_j=x_j}$

for some values $x_j$ of $X_j$.

Definition IV.4 (Quantum objective Bayesian compatibility).
Two quantum states $\sigma^{(1)}_B$ and $\sigma^{(2)}_B$ of a
region $B$ are compatible if it is possible to construct a pair
of random variables $X_1$ and $X_2$ and a hybrid state
$\rho_{BX_1X_2}$ such that $\sigma^{(1)}_B$ can be obtained by
Bayesian conditioning on $X_1 = x_1$ for some value $x_1$, and
$\sigma^{(2)}_B$ can be obtained by Bayesian conditioning on
$X_2 = x_2$ for some value $x_2$, i.e.

$\sigma^{(j)}_B = \rho_{B|X_j=x_j}$ (24)

for some values $x_j$ of $X_j$. Further, we require that
$\rho_{X_1=x_1,X_2=x_2} \neq 0$ so that there is a possibility
for both outcomes to be obtained simultaneously.

This holds whenever the BFM compatibility condition is
satisfied, as the following theorem demonstrates. Recall that
the support of a state $\rho_B$ is the span of the eigenvectors
of $\rho_B$ associated with nonzero eigenvalues. We denote it by
$\mathrm{supp}[\rho_B]$.

Theorem IV.5. Two quantum states $\sigma^{(1)}_B$ and
$\sigma^{(2)}_B$ of a region $B$ satisfy definition IV.4, i.e.
they are compatible in the objective Bayesian sense, iff they
share some common support, i.e.

$\mathrm{supp}[\sigma^{(1)}_B] \cap \mathrm{supp}[\sigma^{(2)}_B] \neq \emptyset$, (25)

where $\cap$ indicates the geometric intersection of the
subspaces.

The proof of this theorem closely resembles the proof of its
classical counterpart. First, note the quantum analogue of
lemma IV.3.

Lemma IV.6. If a hybrid state $\rho_{XB}$ satisfies
$\rho_{X=x} \neq 0$ then
$\mathrm{supp}[\rho_{B|X=x}] \subseteq \mathrm{supp}[\rho_B]$.

Proof. The condition $\rho_{X=x} \neq 0$ implies that
$\rho_{B|X=x}$ is well defined. Let
$\ker[\rho_B] = \{ |\psi\rangle_B \in \mathcal{H}_B \mid \rho_B |\psi\rangle_B = 0 \}$,
i.e. $\ker[\rho_B]$ is the orthogonal complement of
$\mathrm{supp}[\rho_B]$. Let $|\psi\rangle_B \in \ker[\rho_B]$.
Then $\langle\psi|_B \mathrm{Tr}_X(\rho_{BX}) |\psi\rangle_B = 0$.
This implies that
$\langle\psi|_B \rho_{B,X=x'} |\psi\rangle_B = 0$ for every $x'$
because each operator $\rho_{B,X=x'}$ is positive. Consequently,
$\langle\psi|_B \rho_{B|X=x} |\psi\rangle_B = 0$. In other words,
if $|\psi\rangle_B \in \ker[\rho_B]$ then
$|\psi\rangle_B \in \ker[\rho_{B|X=x}]$, which means that
$\ker[\rho_B] \subseteq \ker[\rho_{B|X=x}]$, or equivalently
$\mathrm{supp}[\rho_{B|X=x}] \subseteq \mathrm{supp}[\rho_B]$. □

Proof of theorem IV.5.
The "only if" half:

It is given that there is a hybrid joint state $\rho_{BX_1X_2}$
such that $\sigma^{(j)}_B = \rho_{B|X_j=x_j}$ for some values
$x_j$ of $X_j$. Since $\rho_{X_1=x_1,X_2=x_2} \neq 0$, the
conditional state $\rho_{B|X_1=x_1,X_2=x_2}$ is well defined.
Lemma IV.6 then implies that

$\mathrm{supp}[\rho_{B|X_1=x_1,X_2=x_2}] \subseteq \mathrm{supp}[\rho_{B|X_1=x_1}]$ (26)
$\mathrm{supp}[\rho_{B|X_1=x_1,X_2=x_2}] \subseteq \mathrm{supp}[\rho_{B|X_2=x_2}]$. (27)

Since $\rho_{B|X_1=x_1,X_2=x_2}$ has nontrivial support, it
follows that

$\mathrm{supp}[\rho_{B|X_1=x_1}] \cap \mathrm{supp}[\rho_{B|X_2=x_2}] \neq \emptyset$. (28)

The "if" half:

Given that $\sigma^{(1)}_B$ and $\sigma^{(2)}_B$ have
intersecting support, one can find a quantum state $\mu_B$ such
that

$\sigma^{(1)}_B = p_1 \mu_B + (1-p_1) \eta^{(1)}_B$, (29)
$\sigma^{(2)}_B = p_2 \mu_B + (1-p_2) \eta^{(2)}_B$, (30)

where $\eta^{(1)}_B$ and $\eta^{(2)}_B$ are each quantum states
and $0 < p_1, p_2 \leq 1$.

This decomposition can be used to construct two classical
variables, $X_1$ and $X_2$, and a hybrid state $\rho_{BX_1X_2}$
such that $\rho_{X_1=x_1,X_2=x_2} \neq 0$ and
$\sigma^{(j)}_B = \rho_{B|X_j=x_j}$ for some values $x_1$ and
$x_2$. Let $X_1$ and $X_2$ be bit-valued variables, and define

$\rho_{B|X_1=0,X_2=0} = \mu_B$ (31)
$\rho_{B|X_1=0,X_2=1} = \eta^{(1)}_B$ (32)
$\rho_{B|X_1=1,X_2=0} = \eta^{(2)}_B$. (33)

The result of conditioning on $(X_1 = 1, X_2 = 1)$ can be taken
to be an arbitrary state, denoted $\nu_B$, i.e.

$\rho_{B|X_1=1,X_2=1} = \nu_B$. (34)

Next, define the following (classical) state over $X_1$ and
$X_2$:

$\rho_{X_1X_2} = p_1 p_2 |00\rangle\langle 00|_{X_1X_2}
  + (1-p_1) p_2 |01\rangle\langle 01|_{X_1X_2}
  + p_1 (1-p_2) |10\rangle\langle 10|_{X_1X_2}
  + (1-p_1)(1-p_2) |11\rangle\langle 11|_{X_1X_2}$. (35)

This can be combined with the conditional states defined above
to obtain

$\rho_{BX_1X_2} = \rho_{B|X_1X_2} \rho_{X_1X_2}
  = p_1 p_2 (\mu_B \otimes |00\rangle\langle 00|_{X_1X_2})
  + (1-p_1) p_2 (\eta^{(1)}_B \otimes |01\rangle\langle 01|_{X_1X_2})
  + p_1 (1-p_2) (\eta^{(2)}_B \otimes |10\rangle\langle 10|_{X_1X_2})
  + (1-p_1)(1-p_2) (\nu_B \otimes |11\rangle\langle 11|_{X_1X_2})$. (36)

It is then easy to verify that $\rho_{B|X_1=0}$ and
$\rho_{B|X_2=0}$ are equal to the right-hand sides of eqs. (29)
and (30), and consequently are equal to $\sigma^{(1)}_B$ and
$\sigma^{(2)}_B$ respectively. □
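
The hybrid construction of eqs. (29)-(36) can also be checked numerically for a qubit. The sketch below is illustrative only: the states $\sigma^{(1)}_B$, $\sigma^{(2)}_B$, $\mu_B$, $\nu_B$ and the weights $p_1$, $p_2$ are assumed example values, and the helper names are hypothetical. Because $X_1$ and $X_2$ are classical, conditioning on one of them amounts to summing the corresponding unnormalized blocks of the hybrid state and renormalizing.

```python
import numpy as np

# Assumed example: Wanda assigns the maximally mixed state, Theo assigns |0><0|.
sigma1 = np.array([[0.5, 0.0], [0.0, 0.5]])
sigma2 = np.array([[1.0, 0.0], [0.0, 0.0]])
mu = np.array([[1.0, 0.0], [0.0, 0.0]])   # a state in the common support
p1, p2 = 0.5, 0.5                          # assumed weights

# Remainders eta^(1), eta^(2) obtained by solving eqs. (29) and (30).
eta1 = (sigma1 - p1 * mu) / (1 - p1)
eta2 = (sigma2 - p2 * mu) / (1 - p2)
nu = np.eye(2) / 2                         # arbitrary state for (X1, X2) = (1, 1)

# Conditional states, eqs. (31)-(34), and classical weights, eq. (35).
cond = {(0, 0): mu, (0, 1): eta1, (1, 0): eta2, (1, 1): nu}
prior = {(0, 0): p1 * p2, (0, 1): (1 - p1) * p2,
         (1, 0): p1 * (1 - p2), (1, 1): (1 - p1) * (1 - p2)}

# Unnormalized blocks rho_{B, X1=x1, X2=x2} of the hybrid state, eq. (36).
blocks = {k: prior[k] * cond[k] for k in cond}

def condition(blocks, var, value):
    """State of B after conditioning on X_{var+1} = value (var is 0 or 1)."""
    total = sum(b for key, b in blocks.items() if key[var] == value)
    return total / np.trace(total)

def projector(x1, x2):
    """Projector |x1 x2><x1 x2| on the two classical bits."""
    e = np.zeros(4)
    e[2 * x1 + x2] = 1.0
    return np.outer(e, e)

# Full 8x8 hybrid state, with B as the first tensor factor.
rho_hybrid = sum(np.kron(blocks[k], projector(*k)) for k in blocks)

rho_wanda = condition(blocks, 0, 0)   # rho_{B|X1=0}
rho_theo = condition(blocks, 1, 0)    # rho_{B|X2=0}
print(rho_wanda, rho_theo, np.trace(rho_hybrid))
```

For these example values the conditioned states coincide with the two assumed assignments, and the outcome $(X_1, X_2) = (0, 0)$ has weight $p_1 p_2 > 0$.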



As noted in the classical case, the "if" part of the proof only
establishes sufficiency of the BFM criterion for causal
scenarios that support generic joint states. Certain causal
scenarios may enforce a restriction on the pairs of states
$\sigma^{(1)}_B$ and $\sigma^{(2)}_B$ that Wanda and Theo can
come to assign by conditioning on $X_1$ and $X_2$. For instance,
consider the causal scenarios depicted in fig. 3, where $X_2$ is
obtained by post-processing of $X_1$, so that all correlations
between $X_2$ and $B$ are mediated by $X_1$. In this case, the
only pairs $\sigma^{(1)}_B$ and $\sigma^{(2)}_B$ that can arise
by conditioning on $X_1$ and $X_2$ are those for which the
support of $\sigma^{(1)}_B$ is contained in that of
$\sigma^{(2)}_B$. We are led to the same conclusion as we found
classically: although BFM compatibility is necessary in any
causal scenario, not every pair of BFM compatible state
assignments can arise in every causal scenario. Nonetheless, we
can always find a causal scenario wherein there are no
restrictions on the joint state $\rho_{BX_1X_2}$ and therefore
no restriction on the states to which a pair of agents can be
led by Bayesian conditioning. The causal scenario considered by
BFM, where $X_1$ and $X_2$ are the outcomes of a pair of remote
measurements on $B$ (depicted in fig. 5), is one such example,
as is the causal scenario considered by Jacobs, where $X_1$ and
$X_2$ are the outcomes of a sequential pair of measurements and
$B$ is the output (depicted in fig. [5]) [60].

B. Subjective Bayesian compatibility 

A subjective Bayesian cannot use the approach just 
discussed in general, since it depends on postulating a 
unique prior state over $B$, $X_1$, and $X_2$ (or $Y$, $X_1$, and
$X_2$ in the classical case) that all agents agree upon before
collecting their data. Given that the choice of prior is an 
unanalyzable matter of belief for the subjectivist, there 
is no reason why Wanda and Theo need to agree on a 
prior at the outset and, further, there is no reason why 
the difference in their probability assignments has to be 
explained by their having had access to different data in 
the first place. If it happens that Wanda and Theo did 
have a shared prior before collecting their data then the 
argument runs through, but for the subjective Bayesian 
this is the exception rather than the rule. In fact, since 
subjective Bayesians do not rule out as irrational the pos- 
sibility of agents starting out with contradictory beliefs, 
it might seem that there is no role for compatibility cri- 
teria in this approach at all. 

However, this is not the case since, although subjec- 
tive Bayesians do not analyze how agents arrive at their 
beliefs, they are interested in whether it is possible for 
them to reach inter-subjective agreement in the future, 
i.e. whether it is possible for them to resolve their dif- 
ferences by experiment or whether their disagreement is 
so extreme that one of them has to make a wholesale 
revision of their beliefs in order to reach agreement. In
the classical case, the subjective Bayesian will therefore say
that two probability assignments $Q_1(Y)$ and $Q_2(Y)$ to a
random variable $Y$ are compatible if it is possible to
construct an experiment, for which Wanda and Theo agree upon a
statistical model, i.e. a likelihood function $P(X|Y)$, such
that at least one outcome $X = x$ of the experiment would cause
Wanda and Theo to assign identical probabilities when they
update their probabilities by Bayesian conditioning. In other
words, the subjective Bayesian account of compatibility is in
terms of the possibility of future agreement, in contrast to
the objective Bayesian account, which relies on a guarantee of
agreement in the past [61].



Definition IV.7 (Classical subjective Bayesian compatibility).
Two probability distributions, $Q_1(Y)$ and $Q_2(Y)$, are
compatible if it is possible to construct a random variable $X$
and a conditional probability distribution $P(X|Y)$ (often
called a likelihood function in this context) such that there
exists a value $x$ of $X$ for which
$\sum_Y P(X = x|Y) Q_j(Y) \neq 0$ and

$P_1(Y|X = x) = P_2(Y|X = x)$, (37)

where $P_j(Y|X) = P(X|Y) Q_j(Y) / \sum_Y P(X|Y) Q_j(Y)$.

It turns out that the mathematical criteria that $Q_1(Y)$ and
$Q_2(Y)$ must satisfy in order to be compatible in this
subjective Bayesian sense are precisely the same as those
required for objective Bayesian compatibility.

Theorem IV.8. Two probability distributions $Q_1(Y)$ and
$Q_2(Y)$ satisfy definition IV.7, i.e. they are compatible in
the subjective Bayesian sense, iff they share some common
support, i.e.

$\mathrm{supp}[Q_1(Y)] \cap \mathrm{supp}[Q_2(Y)] \neq \emptyset$. (38)

Proof.
The "only if" half:

Since $P_j(Y|X = x)$ is derived from $Q_j(Y)$ by Bayesian
conditioning, it follows from lemma IV.3 that

$\mathrm{supp}[P_1(Y|X = x)] \subseteq \mathrm{supp}[Q_1(Y)]$ (39)
$\mathrm{supp}[P_2(Y|X = x)] \subseteq \mathrm{supp}[Q_2(Y)]$. (40)

However, by assumption, $P_1(Y|X = x) = P_2(Y|X = x)$, so the
left-hand sides of eqs. (39) and (40) are equal. It follows that
$Q_1(Y)$ and $Q_2(Y)$ have some common support, namely,
$\mathrm{supp}[P_1(Y|X = x)]$.

The "if" half:

By assumption, there is at least one value $y$ of $Y$ belonging
to the common support of $Q_1(Y)$ and $Q_2(Y)$. Let $X$ be a
classical bit and define the likelihood function

$P(X = 0|Y = y) = 1$, $P(X = 1|Y = y) = 0$ (41)
$P(X = 0|Y \neq y) = 0$, $P(X = 1|Y \neq y) = 1$. (42)

If Wanda and Theo agree to use this likelihood function, then,
upon observing $X = 0$, they will update their distributions to

$P_j(Y = y'|X = 0) = \frac{P(X = 0|Y = y') Q_j(Y = y')}{\sum_{y''} P(X = 0|Y = y'') Q_j(Y = y'')} = \delta_{y',y}$, (43)

which is independent of $j$ and hence brings them into
agreement. □

In the quantum case, if Wanda and Theo assign states
$\sigma^{(j)}_B$ to a quantum region then they are compatible if
there is some classical data $X$ that they can collect about the
system, for which Wanda and Theo agree upon a statistical model,
such that observing at least one value $x$ of $X$ would cause
their state assignments to become identical.

Definition IV.9 (Quantum subjective Bayesian compatibility).
Two states $\sigma^{(1)}_B$ and $\sigma^{(2)}_B$ are compatible
if it is possible to construct a random variable $X$ and a
conditional state $\rho_{X|B}$ (which we call a likelihood
operator) such that there exists a value $x$ of $X$ for which
$\mathrm{Tr}_B(\rho_{X=x|B} \sigma^{(j)}_B) \neq 0$ and

$\rho^{(1)}_{B|X=x} = \rho^{(2)}_{B|X=x}$, (44)

where $\rho^{(j)}_{B|X=x}$ is given by the quantum Bayes'
theorem:

$\rho^{(j)}_{B|X=x} = (\rho_{X=x|B} \star \sigma^{(j)}_B) / \mathrm{Tr}_B(\rho_{X=x|B} \sigma^{(j)}_B)$.

Once again, subjective Bayesian compatibility has the same
mathematical consequences as its objective counterpart. Both are
equivalent to requiring the BFM criterion.

Theorem IV.10. Two states $\sigma^{(1)}_B$ and $\sigma^{(2)}_B$
satisfy definition IV.9, i.e. they are compatible in the
subjective Bayesian sense, iff they share common support, i.e.

$\mathrm{supp}[\sigma^{(1)}_B] \cap \mathrm{supp}[\sigma^{(2)}_B] \neq \emptyset$, (45)

where $\cap$ denotes the geometric intersection.
Proof.

The "only if" half:

Since $\rho^{(j)}_{B|X=x}$ is derived from $\sigma^{(j)}_B$ by
Bayesian conditioning, it follows from lemma IV.6 that

$\mathrm{supp}[\rho^{(1)}_{B|X=x}] \subseteq \mathrm{supp}[\sigma^{(1)}_B]$ (46)
$\mathrm{supp}[\rho^{(2)}_{B|X=x}] \subseteq \mathrm{supp}[\sigma^{(2)}_B]$. (47)

However, by assumption, $\rho^{(1)}_{B|X=x} = \rho^{(2)}_{B|X=x}$,
so the left-hand sides of eqs. (46) and (47) are equal. It
follows that $\sigma^{(1)}_B$ and $\sigma^{(2)}_B$ have some
common support, namely, $\mathrm{supp}[\rho^{(1)}_{B|X=x}]$.

The "if" half:

By assumption, the supports of $\sigma^{(1)}_B$ and
$\sigma^{(2)}_B$ have nontrivial intersection. It follows that
there is a pure state $|\psi\rangle_B \in \mathcal{H}_B$ in the
common support. Let $X$ be a classical bit and define the
likelihood operator

$\rho_{X|B} = |0\rangle\langle 0|_X \otimes |\psi\rangle\langle\psi|_B + |1\rangle\langle 1|_X \otimes (I_B - |\psi\rangle\langle\psi|_B)$. (48)

If Wanda and Theo agree to use this likelihood operator, then,
upon observing $X = 0$ they will update their states to

$\rho^{(j)}_{B|X=0} = \frac{\rho_{X=0|B} \star \sigma^{(j)}_B}{\mathrm{Tr}_B[\rho_{X=0|B} \sigma^{(j)}_B]} = |\psi\rangle\langle\psi|_B$, (49)

which is independent of $j$ and hence brings them into
agreement. □



C. Comparison to other approaches 

1. Brun, Finkelstein and Mermin

The original BFM argument [HI, which is objective 
Bayesian in flavor, is divided into arguments for the ne- 
cessity and sufficiency of their criterion. To establish 
sufficiency, they show that for any pair of state assign- 
ments that satisfy their criterion, one can find a triple 
of distinct systems, and a quantum state thereon, such 
that if Wanda measures one system and Theo another, 
then for some pair of outcomes Wanda and Theo are led 
to update their description of the third system to the 
given pair of state assignments. This is equivalent to the
"if" part of our theorem IV.5 when applied to the remote
measurement scenario depicted in fig. 5. The argument
provided by BFM for the necessity of their criterion is 
based on a set of reasonable-sounding requirements. For 
example, their first requirement is: 

If anybody describes a system with a density 
matrix p, then nobody can find it to be in a 
pure state in the null space of p. For although 
anyone can get a measurement outcome that 
everyone has assigned nonzero probabilities, 
nobody can get an outcome that anybody 
knows to be impossible. 

If one is adopting an approach wherein quantum states 
describe the information, knowledge, or beliefs of agents, 
then the notion of finding a system "to be in a pure 
state" is inappropriate, as emphasized by Caves, Fuchs
and Schack [29]. However, even glossing over this, their
argument does not satisfy an ideal to which a proper ob- 
jective Bayesian account of compatibility should strive, 



namely, of being justified by a general methodology for 
Bayesian inference. This ideal is illustrated by the derivation
of the objective Bayesian criterion of classical compatibility
presented in §IV A: if a pair of agents obey
the strictures of objective Bayesianism, i.e. assigning the 
same ignorance priors and updating their probabilities 
via Bayesian conditioning, then they will never encounter 
a situation in which the compatibility condition does not 
hold, and conversely if the compatibility condition holds, 
it is always possible for them to come to their state as- 
signments by Bayesian updating. 

Because we have proposed a methodology for quantum 
Bayesian inference, we can achieve this ideal in the quan- 
tum case as well. Indeed, the close parallel between the 
proofs of the classical and quantum compatibility the- 
orems demonstrates that one can achieve the ideal in 
the quantum context to precisely the same extent that it 
can be achieved in the classical context. Whilst our argument
for sufficiency of the BFM criterion (the second
part of our proof of theorem IV.5) is mathematically similar
to BFM's argument for sufficiency, it is only against
the background of our framework of quantum conditional 
states that it becomes possible to identify the update rule 
used by Wanda and Theo as an instance of Bayesian con- 
ditioning, and thus a special case of a general methodol- 
ogy for Bayesian inference. 

A second point to note is that in our argument for the 
compatibility condition, we consider a triple of space- 
time regions that do not necessarily correspond to three 
distinct systems at a given time — the case considered 
by BFM. The causal relation between them might instead
be any of those depicted in figs. 3-7, or indeed any
scenario wherein all the available information about the 
quantum region can be captured by assigning a single 
quantum state. Thus, our results generalize the range 
of applicability of the BFM compatibility criterion to a 
broader set of causal scenarios. 



2. Jacobs 

Jacobs [28] has also considered the compatibility of
state assignments using an approach that is objective 
Bayesian in flavor. In his analysis, the region of interest 
is the output of a sequence of measurements made one 
after the other on the same system, and Wanda and Theo 
have information about the outcomes of distinct subsets 
of those measurements. A simple version of this scenario 
is where there is a sequence of two measurements, where 
the outcome of the first measurement is known to Wanda 
and the outcome of the second is known to Theo. This is 
just the causal scenario depicted in fig. and as empha- 
sized there, such a scenario falls within the scope of our 
approach. In the objective Bayesian framework, Wanda 
and Theo agree on the input state to the pair of measure- 
ments and they agree on the quantum instruments that 
describe each measurement. Jacobs shows that if Wanda 
and Theo's state assignments are obtained in this way,
then they must satisfy the BFM compatibility criterion,
that is, he provides an argument for the necessity of the 
BFM compatibility criterion in this causal context. 

If Wanda and Theo come to their state assignments for $B$ using
Jacobs' scheme, then, as explained in §III B, their prior
knowledge of $B$ and the two outcome variables, $X_1$ and $X_2$,
can be described by a joint state $\rho_{BX_1X_2}$. After
observing values $x_1$ and $x_2$ respectively, they come to
assign states $\rho_{B|X_1=x_1}$ and $\rho_{B|X_2=x_2}$, which
are derived from the conditional states of the joint state
$\rho_{BX_1X_2}$. Such a pair of states satisfies the definition
IV.4 of quantum objective Bayesian compatibility. Theorem IV.5
then implies that their state assignments satisfy the BFM
compatibility criterion. Conversely, because the set of joint
states $\rho_{BX_1X_2}$ which can arise in this causal scenario
is unrestricted (see footnote [60]), it also follows from
theorem IV.5 that for any pair of state assignments satisfying
the BFM criterion, Wanda and Theo could come
to assign those states in this causal scenario. These re- 
sults can be generalized to the case of a longer sequence 
of measurements with the outcomes distributed arbitrar- 
ily among a number of parties, which covers the most 
general case considered by Jacobs. 

To summarize, our results can be applied to Jacobs' 
scenario and we recover Jacobs' result that the BFM cri- 
terion is a necessary requirement. Furthermore, we have 
improved upon Jacobs' analysis in two ways. Firstly, 
we have shown that the BFM compatibility criterion is 
not only a necessary condition for compatibility in this 
scenario, but is sufficient as well. Secondly, our anal- 
ysis demonstrates that, just as with the scenario of re- 
mote measurements, the BFM criterion can be justified 
in the scenario of sequential measurements by insisting 
that states should be updated by Bayesian conditioning 
within a general framework for quantum Bayesian infer- 
ence. 



3. Caves, Fuchs and Schack 

In contrast to BFM and Jacobs, CFS [29] discuss the
problem of quantum state compatibility from an explic- 
itly subjective Bayesian point of view. They argue that 
there cannot be a unilateral requirement to impose com- 
patibility criteria of any sort on subjective Bayesian de- 
grees of belief because there is no unique prior quantum 
state that an agent ought to assign in light of a given 
collection of data. The only necessary constraint is that 
states should satisfy the axioms of quantum theory, i.e. 
they should be normalized density operators. In partic- 
ular, it should not be viewed as irrational for two agents 
to assign distinct, or even orthogonal, pure states to a 
quantum system. 

Whilst we agree with this argument, we think that 
there is still a role for compatibility criteria within the 
subjective approach. They can be viewed as a check to 
see whether it is worthwhile for the agents to engage in 
a particular inference procedure, and this is conceptually 



distinct from viewing them as unilateral requirements 
that must be imposed upon all state assignments. In 
the case of BFM compatibility, the criterion of intersect- 
ing supports is simply a check that agents can apply to 
see if it is worth their while to try and resolve their dif- 
ferences empirically by collecting more data, or whether 
their disagreement is so extreme that resolving it requires 
one or more of the agents to make a wholesale revision 
of their beliefs. From this point of view, BFM plays the 
same role as the criterion of overlapping supports does in 
classical subjective Bayesian probability. 

Despite their skepticism of compatibility criteria, CFS 
do attempt to recast the necessity part of the BFM ar- 
gument in terms that would be more acceptable to the 
subjective Bayesian, i.e. they outline a series of require- 
ments that a pair of subjective Bayesian agents may wish 
to adopt that would lead them to assign BFM compat- 
ible states. They do not provide an argument for suffi- 
ciency, so this is one way in which our argument is more 
complete. CFS's argument is quite similar to the BFM 
necessity argument, except that it is phrased in terms 
that would be more acceptable to a subjective Bayesian. 
For example, they talk about the "firm beliefs" of agents 
rather than saying that systems are "found to be" in cer- 
tain pure states. However, this line of argument is still 
open to an objection that we leveled against the BFM 
argument. In our view, compatibility criteria should be 
derived from the inference methodologies that are being 
used by the agents rather than from a list of reason- 
able sounding requirements. Another objection is that 
their argument relies on strong Dutch Book coherence, 
which is a strengthening of the usual Dutch Book coher- 
ence that subjective Bayesians use to derive the structure 
of classical probability theory. Strong coherence entails 
that if an agent assigns probability one to an event then 
she must be certain that it will occur. This is obviously 
problematic in the case of infinite sample spaces due to 
the presence of sets of measure zero and, since there is 
nothing in the Dutch Book argument that singles out fi- 
nite sample spaces, it would not usually be accepted by 
subjective Bayesians in that case either. 

Since CFS do not believe that the BFM criterion is 
a uniquely compelling requirement, they also introduce 
a number of weaker compatibility criteria based on the 
compatibility of the probability distributions obtained by 
making different types of measurement on the system. 
Three of these compatibility criteria are equivalent to the 
usual intersecting support criterion in the classical case, 
but they become inequivalent when applied to quantum 
theory. Presumably, this is supposed to cast doubt upon 
the uniqueness of BFM as a compelling compatibility cri- 
terion in the quantum case. However, in our view, the 
non-BFM criteria in the CFS hierarchy are not meaningful
as compatibility criteria. To explain why, we take
their weakest criterion, $W'$ compatibility, as an example.

The $W'$ criterion says that two quantum states are
compatible if there exists a measurement for which the
Born rule outcome probability distributions computed
from the two states are compatible in the classical sense, 
i.e. they have intersecting support in the set of outcomes. 
It is fairly easy to see that this places no constraint at all 
on state assignments — such a measurement can always 
be found. For example, if Wanda and Theo assign two 
orthogonal pure states then a measurement in a comple- 
mentary basis would always yield compatible probability 
distributions over the set of outcomes. CFS argue that 
Wanda and Theo could resolve their differences empir- 
ically by making such a measurement in this scenario. 
After the measurement, if both Wanda and Theo learn 
the outcome and apply the projection postulate, then 
they would end up assigning the same state to the sys- 
tem, specifically, the state in the complementary basis 
corresponding to the outcome that was observed. 
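
The example of orthogonal pure states measured in a complementary basis can be made concrete. The following sketch assumes a qubit, with Wanda assigning $|0\rangle$ and Theo assigning $|1\rangle$; these specific states and the helper names are illustrative choices, not taken from CFS.

```python
import numpy as np

# Assumed example: orthogonal pure-state assignments on a qubit.
ket0 = np.array([1.0, 0.0])            # Wanda's state
ket1 = np.array([0.0, 1.0])            # Theo's state
plus = (ket0 + ket1) / np.sqrt(2)      # complementary basis vectors
minus = (ket0 - ket1) / np.sqrt(2)

def born_probs(state, basis):
    """Born-rule outcome probabilities for a projective measurement in `basis`."""
    return np.array([abs(np.vdot(b, state)) ** 2 for b in basis])

def supports_intersect(p, q, tol=1e-12):
    """Classical compatibility: some outcome has nonzero probability under both."""
    return bool(np.any((p > tol) & (q > tol)))

# Measurement in the basis containing the two states: disjoint outcome supports.
comp_wanda = born_probs(ket0, [ket0, ket1])
comp_theo = born_probs(ket1, [ket0, ket1])

# Measurement in the complementary basis: both distributions are uniform.
x_wanda = born_probs(ket0, [plus, minus])
x_theo = born_probs(ket1, [plus, minus])

print(supports_intersect(comp_wanda, comp_theo))
print(supports_intersect(x_wanda, x_theo))
```

In this sketch the outcome distributions have intersecting support for the complementary measurement even though the two states are orthogonal, which illustrates why the criterion places no constraint on state assignments.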

However, in our view, this does not resolve the origi- 
nal conflict between Wanda and Theo. Although Wanda 
and Theo's state assignments to the region after the mea- 
surement (its quantum output) are now identical, their 
state assignments to the region before the measurement 
(its quantum input) remain unchanged. As stated in
§III A and explained in more detail in [1], the state of
the region before the measurement updates via quantum 
Bayesian conditioning rather than by the projection pos- 
tulate. Pure states are fixed points of quantum Bayesian 
conditioning, so Wanda and Theo will always continue 
to disagree about the state of this region, whatever in- 
formation they later acquire about the region. 

The mistake that CFS have made is to think of compat- 
ibility in terms of persistent systems rather than spatio- 
temporal regions, and to think of the projection postu- 
late as a quantum analogue of Bayesian conditioning. It 
is easy to make this mistake because in a classical the- 
ory of Bayesian inference, a measurement can be non- 
disturbing. In this case, the value of the variable Y be- 
ing measured is not changed by the measurement, and 
the update rule for the probability distribution of Y can 
be understood as conditioning Y on the outcome of the 
measurement. The variable describing the system be- 
fore the measurement is the same as the one describing 
it after, so that updating your beliefs about one is the 
same as updating your beliefs about the other. But this 
is no longer true for classical measurements that disturb 
the system, and as argued in [1], all nontrivial quantum
measurements are analogous to these. Therefore, to highlight
the problem with the $W'$ compatibility criterion, we
consider what it would predict in the case of a disturbing 
classical measurement. 

Suppose the system is a coin that has just been flipped, 
but is currently hidden from Wanda and Theo. If Wanda 
believes that the coin has definitely landed heads and 
Theo believes that it has definitely landed tails, then 
their beliefs are certainly incompatible. If the coin is 
then flipped again and Wanda and Theo are shown the 
outcome of the second toss, they will agree on the cur- 
rent state of the coin, and hence their state assignments 
to the system after the observation are now compatible. 



However, because the configuration of the coin was dis- 
turbed in the process of measurement, there is no sense in 
which their disagreement about the outcome of the first 
coin flip has been resolved. Similarly, we believe that 
because nontrivial quantum measurements always entail 
a disturbance (in the sense described in [1]), coming to
agreement about the state of the region after the mea- 
surement does not resolve a disagreement about the state 
of the region before the measurement. 

Despite our reservations about the CFS compatibility 
criteria, they are still of some independent interest. In 
particular, one of them (the PP criterion) was recently 
used in a quite different context as part of a no-go theorem
for certain types of hidden variable models for quantum
theory [46].

V. INTERMEZZO: CONDITIONAL 
INDEPENDENCE AND SUFFICIENCY 

Having dealt with state compatibility, our next task 
is to develop a Bayesian approach to combining state 
assignments. In order to do this, two additional con- 
cepts are needed: conditional independence and sufficient 
statistics, which are reviewed in this section. Quantum 
conditional independence has been studied in [40], from
which we quote results without proof.

A. Conditional independence 

1. Classical conditional independence 

A pair of random variables $R$ and $S$ are conditionally
independent given another random variable $T$ if they satisfy
any of the following equivalent conditions:

CI1: $P(S|R,T) = P(S|T)$

CI2: $P(R|S,T) = P(R|T)$

CI3: $I(R : S|T) = 0$

CI4: $P(R,S|T) = P(R|T)P(S|T)$,

where it is left implicit that these equations only have
to hold for those values of the variables for which the
conditionals are well-defined. Here, $I(R : S|T)$ is the
conditional mutual information of $R$ and $S$ given $T$,
defined by

$I(R : S|T) = H(R,T) + H(S,T) - H(T) - H(R,S,T)$, (50)

where $H(R) = -\sum_r P(R = r) \log_2 P(R = r)$ is the
Shannon entropy of $R$, with the obvious generalization
to multiple variables. Note that the conditional mutual
information is always non-negative.

Conditional independence of R and S given T means 
that any correlations between R and S are mediated, or
screened-off, by T. In other words, if one were to learn
the value of T then R and S would become independent. 
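
Condition CI3 can be evaluated directly from eq. (50). The sketch below is illustrative: the joint distributions are assumed examples and the function names are hypothetical. The first distribution is built so that $T$ screens off $R$ from $S$; the second has $R = S$ with $T$ independent of both.

```python
import numpy as np

def shannon(p):
    """Shannon entropy in bits of a (possibly multi-dimensional) distribution."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def cond_mutual_info(P):
    """I(R:S|T) from eq. (50) for a joint array indexed as P[r, s, t]."""
    H_RT = shannon(P.sum(axis=1))        # marginal over S
    H_ST = shannon(P.sum(axis=0))        # marginal over R
    H_T = shannon(P.sum(axis=(0, 1)))    # marginal over R and S
    return H_RT + H_ST - H_T - shannon(P)

# Assumed example 1: P(r, s, t) = P(t) P(r|t) P(s|t), so R and S are
# conditionally independent given T.
p_t = np.array([0.5, 0.5])
p_r_given_t = np.array([[0.9, 0.1], [0.2, 0.8]])   # indexed [t, r]
p_s_given_t = np.array([[0.7, 0.3], [0.4, 0.6]])   # indexed [t, s]
P_screened = np.einsum('t,tr,ts->rst', p_t, p_r_given_t, p_s_given_t)

# Assumed example 2: R = S is a uniform bit and T is an independent uniform bit.
P_copy = np.zeros((2, 2, 2))
for r in range(2):
    for t in range(2):
        P_copy[r, r, t] = 0.25

print(cond_mutual_info(P_screened))
print(cond_mutual_info(P_copy))
```

The first value vanishes up to floating-point error, while the second equals one bit, since learning $T$ does nothing to remove the correlation between $R$ and $S$.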



2. Quantum conditional independence for acausally related 
regions 

In the quantum case, the three random variables $R$, $S$ and
$T$ become quantum regions with Hilbert spaces $\mathcal{H}_A$,
$\mathcal{H}_B$ and $\mathcal{H}_C$. We specialize to the case
of three acausally related regions because the theory of
conditional independence has not yet been developed for other
causal scenarios. Prior to the introduction of conditional
states, it was not obvious whether the conditional independence
conditions CI1, CI2 and CI4 had quantum analogs, but CI3 has a
straightforward generalization where $I(A : B|C)$ is now the
quantum conditional mutual information defined as

$I(A : B|C) = S(A,C) + S(B,C) - S(C) - S(A,B,C)$, (51)

where $S(A) = -\mathrm{Tr}_A(\rho_A \log \rho_A)$ is the von
Neumann entropy of the state on $A$. The quantum conditional
mutual information satisfies $I(A : B|C) \geq 0$, which is
equivalent to the strong sub-additivity inequality [47], and so
the quantum conditional independence condition $I(A : B|C) = 0$
is the equality condition for strong sub-additivity.
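
Eq. (51) can be evaluated numerically for small systems. The following is a sketch under stated assumptions: three qubits ordered as $A$, $B$, $C$, entropies taken in bits (the text leaves the base of the logarithm unspecified), and two illustrative states. All function names are hypothetical.

```python
import numpy as np

def vn_entropy(rho):
    """Von Neumann entropy in bits."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log2(evals)))

def reduced(rho, keep, n=3):
    """Reduced state on the qubits listed in `keep` (indices 0..n-1)."""
    t = rho.reshape([2] * (2 * n))
    m = n
    # Trace out unwanted qubits from the highest index down, so that the
    # axis positions of the remaining qubits are unaffected.
    for q in sorted(set(range(n)) - set(keep), reverse=True):
        t = np.trace(t, axis1=q, axis2=q + m)
        m -= 1
    return t.reshape(2 ** m, 2 ** m)

def quantum_cmi(rho):
    """I(A:B|C) from eq. (51) for three qubits ordered as A, B, C."""
    return (vn_entropy(reduced(rho, [0, 2])) + vn_entropy(reduced(rho, [1, 2]))
            - vn_entropy(reduced(rho, [2])) - vn_entropy(rho))

P0 = np.array([[1.0, 0.0], [0.0, 0.0]])
P1 = np.array([[0.0, 0.0], [0.0, 1.0]])
mixed = np.eye(2) / 2

# Assumed example 1: C is classical and, for each value of C, the state of
# AB is a product state, so A and B are conditionally independent given C.
rho_ci = 0.5 * np.kron(P0, np.kron(mixed, P0)) + 0.5 * np.kron(mixed, np.kron(P1, P1))

# Assumed example 2: a Bell state on AB, with C in a fixed pure state.
bell = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)
rho_bell = np.kron(np.outer(bell, bell), P0)

print(quantum_cmi(rho_ci))
print(quantum_cmi(rho_bell))
```

The first state saturates strong sub-additivity, giving $I(A : B|C) = 0$, whereas the second gives two bits because $C$ carries no information that could screen off the entanglement between $A$ and $B$.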

In the conditional states formalism, there are direct
analogs of the conditions CI1 and CI2 that provide an
alternative characterization of quantum conditional
independence.

Theorem V.1. For three acausally related regions, $A$,
$B$ and $C$, the following conditions are equivalent:

QCI1: $\rho_{A|BC} = \rho_{A|C}$

QCI2: $\rho_{B|AC} = \rho_{B|C}$

QCI3: $I(A : B|C) = 0$

Due to these equivalences, any of QCI1-QCI3 can be
viewed as the definition of quantum conditional independence.

It is also true that 

Theorem V.2. If A is conditionally independent of B 
given C , then 

QCI4: pab\c = Pa\cPb\c- 

Because pab\c is self-adjoint, theorem lV.2l implies that 
Pa\c an d Pb\c must commute when A and B are con- 
ditionally independent given C. Unlike in the classical 
case, the converse of theorem IV.2I does not hold, i.e. 
Pab\c = Pa\cPb\c does not imply conditional indepen- 
dence. Extra constraints on the form of pc can be im- 
posed to yield equivalence, but these are not important 
for present purposes (see [40| for details). 



3. Hybrid conditional independence 

The case that is most relevant to the present work is
when two classical random variables $X_1$ and $X_2$ are
conditionally independent given a quantum region $B$. The
proofs of theorems V.1 and V.2 only depend on the existence
of a joint state (positive, normalized, density operator)
for the three regions under consideration. Therefore,
if we specialize to causal scenarios in which a joint state
$\rho_{BX_1X_2}$ can be assigned, as discussed in §III B, then the
definitions QCI1–QCI3 can now be applied in any of
these causal scenarios by substituting $X_1$ for $A$, $X_2$ for
$B$ and $B$ for $C$. The consequence QCI4 also applies to
this case.



B. Sufficient statistics 

The idea of a sufficient statistic can be motivated by a 
typical example problem in statistics: estimating the bias 
of a coin from a sequence of coin flips that are judged to 
be independent and identically distributed. In this prob- 
lem, only the relative frequency of occurrence of heads 
and tails in the sequence is relevant to the bias, whilst the 
exact ordering of heads and tails is irrelevant. The rela- 
tive frequency is then an example of a sufficient statistic 
for the sequence with respect to the bias. In this section, 
this notion is generalized to the hybrid case wherein the 
classical parameter to be estimated is replaced by a quan- 
tum region, but the data is still classical, i.e. this section 
concerns sufficient statistics for classical data with re- 
spect to a quantum region. Note that quantum sufficient 
statistics have been considered before in the literature 
[41–43], but these works are somewhat orthogonal to the
present treatment because they concern sufficiency of a 
quantum system with respect to classical measurement 
data [HI, , or the sufficiency of measurement data with 
respect to preparation data [41].



1. Classical sufficient statistics 

Suppose a parameter, represented by a random vari- 
able Y , is to be estimated from data, represented by a 
random variable X. 

Definition V.3. A sufficient statistic for $X$ with respect
to $Y$ is a function $t$ of the values of $X$ such that the
random variable $t(X)$ satisfies

$$P(Y|t(X) = t(x)) = P(Y|X = x), \tag{52}$$

for all $x$ such that $P(X = x) \neq 0$.

A sufficient statistic for $X$ is a way of processing $X$
such that the result is just as informative about $Y$ as $X$
is. In other words, learning the value of the processed
variable $t(X)$ allows an agent to make all the same inferences
about $Y$ that they could have made by learning the
value of $X$ itself. Such processings are coarse-grainings
of the values of $X$, which discard information about $X$,
but only information that is not relevant for making
inferences about $Y$.
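
The coin example can be made concrete with a short numerical sketch. It is an illustration under simplifying assumptions that are not in the original text: the bias is restricted to three hypothetical candidate values with a uniform prior, and the sequence has four flips.

```python
from itertools import product
import numpy as np

# Hypothetical setup: Y is the bias of a coin, restricted to three candidate values
# with a uniform prior, and X is a sequence of n i.i.d. flips (1 = heads).
biases = np.array([0.2, 0.5, 0.8])
prior = np.full(3, 1 / 3)
n = 4

def likelihood(seq):
    """P(X = seq | Y = bias) for each candidate bias, assuming i.i.d. flips."""
    heads = sum(seq)
    return biases**heads * (1 - biases)**(len(seq) - heads)

def posterior_given_sequence(seq):
    """P(Y | X = seq) by Bayes' theorem."""
    joint = likelihood(seq) * prior
    return joint / joint.sum()

def posterior_given_count(k):
    """P(Y | t(X) = k), where t(X) is the number of heads in the sequence."""
    total = sum(likelihood(seq) for seq in product([0, 1], repeat=n) if sum(seq) == k)
    joint = total * prior
    return joint / joint.sum()

# The statistic t(x) = number of heads satisfies eq. (52) for every sequence x.
is_sufficient = all(
    np.allclose(posterior_given_sequence(seq), posterior_given_count(sum(seq)))
    for seq in product([0, 1], repeat=n)
)
```

Here `is_sufficient` confirms that conditioning on the head count gives the same posterior as conditioning on the full sequence, so the ordering of the flips carries no additional information about the bias.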

Since $t(X)$ is just a function of $X$, it is immediate that
$Y$ is conditionally independent of $t(X)$ given $X$, i.e.

$$P(Y|X, t(X)) = P(Y|X). \tag{53}$$

This follows from the fact that we can write the joint
distribution as $P(Y, X, t(X)) = P(t(X)|X)P(Y, X)$ (where
$P(t(X) = a|X = x) = \delta_{a,t(x)}$). Moreover, the sufficiency
condition, eq. (52), implies that it is also true that $Y$ is
conditionally independent of $X$ given $t(X)$, i.e.

$$P(Y|X, t(X)) = P(Y|t(X)). \tag{54}$$

This is a consequence of the fact that the joint
distribution can also be written as
$P(Y, X, t(X)) = P(t(X)|X)P(Y|t(X))P(X)$, where we have used eq. (52).

Definition V.4. A minimal sufficient statistic for X 
with respect to Y is a sufficient statistic that can be writ- 
ten as a function of any other sufficient statistic for X 
with respect to Y. 

A minimal sufficient statistic for X with respect to 

Y contains only that information about X that is rele- 
vant for making inferences about Y. Clearly, a sufficient 
statistic t is minimal iff 

$$t(x) = t(x') \Leftrightarrow P(Y|X = x) = P(Y|X = x'). \tag{55}$$

The following lemma is used repeatedly in our discus- 
sion of combining quantum states. 

Lemma V.5. Let $P(X,Y)$ be a probability distribution
over two random variables and let $t(x) = P(Y|X = x)$,
i.e. $t$ is a statistic for $X$ that takes functions of $Y$ for
its values. Then, $t$ is a minimal sufficient statistic for $X$
with respect to $Y$ and

$$P(Y|t(X) = t(x)) = t(x). \tag{56}$$

Proof. Clearly $t$ satisfies eq. (55) because $t(x)$ is equal to
$P(Y|X = x)$ in this case. It is therefore minimally sufficient.
By the conditional version of belief propagation

$$P(Y|t(X) = t(x)) = \sum_{x'} P(Y|X = x', t(X) = t(x)) P(X = x'|t(X) = t(x)). \tag{57}$$

Since $t$ is a sufficient statistic, $Y$ is conditionally
independent of $t(X)$ given $X$, so this reduces to

$$P(Y|t(X) = t(x)) = \sum_{x'} P(Y|X = x') P(X = x'|t(X) = t(x)). \tag{58}$$

The term $P(X = x'|t(X) = t(x))$ is only nonzero for
those values $x'$ such that $t(x') = t(x)$ and all such values
satisfy $P(Y|X = x') = P(Y|X = x)$. Therefore,

$$P(Y|t(X) = t(x)) = P(Y|X = x) \sum_{\{x'|t(x') = t(x)\}} P(X = x'|t(X) = t(x)). \tag{59}$$

However, $\sum_{\{x'|t(x') = t(x)\}} P(X = x'|t(X) = t(x)) = \sum_{x'} P(X = x'|t(X) = t(x)) = 1$,
since $P(X = x'|t(X) = t(x))$ is zero when $t(x') \neq t(x)$ and it is a conditional
probability distribution. Hence,

$$P(Y|t(X) = t(x)) = P(Y|X = x) \tag{60}$$
$$= t(x). \tag{61}$$

□
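
Lemma V.5 can also be checked numerically. The sketch below is an illustration only: the joint distribution is a hypothetical example in which two values of $X$ lead to the same posterior over $Y$, and the rounding used to make posteriors hashable is an implementation convenience.

```python
import numpy as np
from collections import defaultdict

# Hypothetical joint distribution P(X, Y) indexed [x, y]; rows 0 and 2 are
# proportional, so X = 0 and X = 2 yield the same posterior over Y.
p_xy = np.array([
    [0.10, 0.30],
    [0.20, 0.05],
    [0.05, 0.15],
    [0.09, 0.06],
])
p_x = p_xy.sum(axis=1)

def t(x):
    """The statistic t(x) = P(Y | X = x), returned as a hashable tuple."""
    return tuple(np.round(p_xy[x] / p_x[x], 12))

# Group the values of X by the value of the statistic.
groups = defaultdict(list)
for x in range(len(p_x)):
    groups[t(x)].append(x)

def posterior_given_statistic(value):
    """P(Y | t(X) = value), obtained by summing the joint over the matching x."""
    joint = p_xy[groups[value]].sum(axis=0)
    return joint / joint.sum()

# Eq. (56): conditioning on t(X) = t(x) returns t(x) itself.
lemma_holds = all(np.allclose(posterior_given_statistic(v), v) for v in groups)
num_values = len(groups)
```

The statistic takes three distinct values for the four values of $X$, and conditioning on each of them reproduces that value as the posterior.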

Eq. (56) looks superficially similar to Lewis' Principal
Principle [48], which states that when you know that the
objective chance of an event takes a particular value then
you should assign that value as your subjective probability
for that event. However, eq. (56) is not a statement
about objective chances. Its interpretation is entirely in
terms of subjective probabilities. Suppose $P(X, Y)$ is
your subjective probability distribution for $X$ and $Y$ and
you announce this to me. I then go and observe $X$, finding
that it has the value $x$. If I then tell you that the
subjective probability distribution that you would assign
to $Y$ if you knew the value of $X$ that I have observed is
$Q(Y)$, and you believe that I am being honest, i.e. that I
have computed $Q(Y) = P(Y|X = x)$ from your subjective
probability distribution and this is what I am reporting
back to you, then you have learned that $t(X) = Q$
and eq. (56) says that your posterior probability distribution
for $Y$ should now be $Q(Y)$.

2. Hybrid sufficient statistics 

Recall that if $XB$ is a hybrid region then conditional
density operators $\rho_{B|X}$ are of the form

$$\rho_{B|X} = \sum_x |x\rangle\langle x|_X \otimes \rho_{B|X=x}, \tag{62}$$

where the operators $\rho_{B|X=x}$ are normalized density operators
on $\mathcal{H}_B$. As in the classical case, the idea of sufficiency
is to find a statistic for $X$ with fewer values than
$X$ that still allows the conditional density operator to be
reconstructed. In order to do this, it is only necessary
to know which density operator $\rho_{B|X=x}$ a value of $X$
corresponds to, and there may be fewer distinct density
operators than values of $X$. This motivates the following
definition.

Definition V.6. A sufficient statistic for $X$ with respect
to the quantum region $B$ is a function $t$ of the values of
$X$ such that the random variable $t(X)$ satisfies

$$\rho_{B|t(X)=t(x)} = \rho_{B|X=x}, \tag{63}$$

for all $x$ such that $\rho_{X=x} \neq 0$.

This definition captures the notion that learning the 
value of the processed variable t(X) allows an agent to 
make all the same inferences about the quantum region 
B that they could have made by learning the value of X 
itself. 

Since $t(X)$ is just a classical processing of $X$ (specifically,
$\rho_{t(X)=a|X=x} = \delta_{a,t(x)}$), we can introduce a joint
state on the composite system $BXt(X)$ as discussed in
CnTEfl via 

$$\rho_{BXt(X)} = \rho_{t(X)|X}\rho_{BX}. \tag{64}$$

As one can easily verify, this state satisfies the analogous
conditional independence relations to those that hold in
the classical case. Specifically, $B$ and $t(X)$ are conditionally
independent given $X$,

$$\rho_{B|Xt(X)} = \rho_{B|X}, \tag{65}$$

and because $t(X)$ is a sufficient statistic for $X$ with respect
to $B$, it is also the case that $B$ and $X$ are conditionally
independent given $t(X)$,

$$\rho_{B|Xt(X)} = \rho_{B|t(X)}, \tag{66}$$

as can be seen by noting that the joint state can also be
written as $\rho_{BXt(X)} = \rho_{t(X)|X}\rho_{B|t(X)}\rho_X$ if one makes use
of eq. (63).

Definition V.7. A minimal sufficient statistic for X 
with respect to a quantum region B is a sufficient statistic 
that can be written as a function of any other sufficient 
statistic for X with respect to a quantum region B. 

It follows that minimal sufficiency is equivalent to 

$$t(x) = t(x') \Leftrightarrow \rho_{B|X=x} = \rho_{B|X=x'}. \tag{67}$$

We will also need an analog of lemma V.5.

Lemma V.8. Let $\rho_{XB}$ be the state of a hybrid region
$XB$ and let $t(x) = \rho_{B|X=x}$, i.e. $t$ is a statistic for $X$ that
takes quantum states on $B$ for its values. Then, $t$ is a
minimal sufficient statistic for $X$ with respect to $B$ and

$$\rho_{B|t(X)=t(x)} = t(x). \tag{68}$$

Proof. The statistic $t$ satisfies eq. (67) because $t(x)$ is
equal to $\rho_{B|X=x}$. It is therefore minimally sufficient. By
the conditional version of belief propagation

$$\rho_{B|t(X)=t(x)} = \mathrm{Tr}_X\left(\rho_{B|X,t(X)=t(x)}\rho_{X|t(X)=t(x)}\right). \tag{69}$$

Since $t$ is a sufficient statistic, $B$ is conditionally
independent of $t(X)$ given $X$, so this reduces to

$$\rho_{B|t(X)=t(x)} = \mathrm{Tr}_X\left(\rho_{B|X}\rho_{X|t(X)=t(x)}\right). \tag{70}$$

However, $\rho_{X=x'|t(X)=t(x)}$ is only nonzero for those values
$x'$ such that $t(x') = t(x)$ and all such values satisfy
$\rho_{B|X=x'} = \rho_{B|X=x}$. Therefore,

$$\rho_{B|t(X)=t(x)} = \rho_{B|X=x} \sum_{\{x'|t(x')=t(x)\}} \rho_{X=x'|t(X)=t(x)}. \tag{71}$$

However, $\sum_{\{x'|t(x')=t(x)\}} \rho_{X=x'|t(X)=t(x)} = \mathrm{Tr}_X\left(\rho_{X|t(X)=t(x)}\right) = 1$,
since $\rho_{X=x'|t(X)=t(x)}$ is zero
when $t(x') \neq t(x)$ and $\rho_{X|t(X)}$ is a conditional state.
Hence,

$$\rho_{B|t(X)=t(x)} = \rho_{B|X=x} \tag{72}$$
$$= t(x). \tag{73}$$

□

VI. QUANTUM STATE IMPROVEMENT 

State improvement is the task of updating your state 
assignment in the light of learning another agent's state 
assignment. It is the simplest example of a procedure 
for combining different states. We adopt the approach 
of treating the other agent's state assignment as data 
and conditioning on it. In the classical case, this idea is 
usually attributed to Morris [49].

A. General methodology for state improvement 

Classically, suppose a decision maker, Debbie, assigns
a prior state $P_0(Y)$ to the variable of interest, $Y$. Debbie
may have little or no specialist knowledge about $Y$, in
which case her prior would be something like a uniform
distribution. In order to improve the quality of her decision,
she consults an expert, Wanda, who reports her
opinion in the form of a state $Q_1(Y)$. Assuming that
Debbie does not have the expertise to assess the data
and arguments by which Wanda arrived at her state assignment,
the summary $Q_1(Y)$ is all she has to go on.

In order to improve her state assignment by Bayesian
conditioning, Debbie has to treat Wanda's state assignment
as data. This means that she has to construct a
likelihood function $P_0(R|Y)$, where $R$ is a random variable
that ranges over all the possible state assignments
that Wanda might report. Since $R$ ranges over a space
of functions, there may be technical difficulties in defining
a sample space for it, but in practice $R$ can usually
be confined to well parameterized families of states, e.g.
Gaussian states or a finite set of choices. In assigning
her likelihood function, Debbie has to take into account
factors such as Wanda's trustworthiness, her accuracy
in making previous predictions, and so forth. Assuming
that Debbie can do this, she can then update her prior
state via Bayes' theorem to obtain

$$P_0(Y|R = Q_1) = \frac{P_0(R = Q_1|Y)P_0(Y)}{P_0(R = Q_1)}, \tag{74}$$

where $P_0(R = Q_1) = \sum_Y P_0(R = Q_1|Y)P_0(Y)$.
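
As a minimal sketch of this classical update, assume that Wanda's report is restricted to a finite menu of two candidate assignments. The prior, the likelihood table and the function name `improve` are hypothetical values chosen for illustration.

```python
import numpy as np

# Hypothetical scenario: Y takes three values and Wanda's report R is restricted
# to a finite menu of two candidate state assignments, labelled 0 and 1.
prior = np.array([0.5, 0.3, 0.2])          # Debbie's prior P_0(Y)

# Debbie's assumed likelihood P_0(R = r | Y = y), indexed [r, y]; each column sums to 1.
likelihood = np.array([
    [0.9, 0.5, 0.1],
    [0.1, 0.5, 0.9],
])

def improve(prior, likelihood, reported):
    """Eq. (74): P_0(Y | R = reported) via Bayes' theorem."""
    joint = likelihood[reported] * prior    # P_0(R = r | Y) P_0(Y)
    evidence = joint.sum()                  # P_0(R = r)
    return joint / evidence

improved = improve(prior, likelihood, reported=1)
```

The likelihood table encodes how much Debbie trusts Wanda: the closer its rows are to each other, the less the report shifts Debbie's prior.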

Turning to the quantum case, the situation is precisely
the same except that we are now dealing with hybrid
regions and the quantum Bayes' theorem. Specifically,
Debbie is now interested in a quantum region $B$, to which
she assigns a prior state $\rho^{(0)}_B$, and Wanda announces her
expert state assignment $\sigma^{(1)}_B$. Debbie treats Wanda's announcement
as data and constructs a classical random
variable $R$ that takes Wanda's possible state assignments
as values. Constructing a sample space for all possible
states is again technically subtle, but in practice attention
can be restricted to well-parameterized families.
Debbie's likelihood is now a hybrid conditional state $\rho^{(0)}_{R|B}$
and she updates her prior state assignment via the hybrid
Bayes' theorem to give





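$$\rho^{(0)}_{B|R=\sigma^{(1)}_B} = \rho^{(0)}_{R=\sigma^{(1)}_B|B} \star \left(\rho^{(0)}_B \left[\rho^{(0)}_{R=\sigma^{(1)}_B}\right]^{-1}\right), \tag{75}$$

where $\rho^{(0)}_{R=\sigma^{(1)}_B} = \mathrm{Tr}_B\left(\rho^{(0)}_{R=\sigma^{(1)}_B|B}\rho^{(0)}_B\right)$.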

Note that the same methodology can be applied when
Debbie consults more than one expert: Wanda, Theo,
etc. Debbie simply has to construct a likelihood function
$P(R_1, R_2, \ldots|Y)$ in the classical case or a likelihood
operator $\rho_{R_1R_2\ldots|B}$ in the quantum case, where $R_1$ represents
Wanda's state assignment, $R_2$ represents Theo's
state assignment, etc. She then applies the appropriate
version of Bayes' theorem to condition on the state assignments
that the experts announce. This procedure is
used in our approach to the pooling problem, discussed
in §VII.



B. The case of shared priors 

Eqs. (74) and (75) are the general rules that Debbie
should use to improve her state assignment, but in practice
it can be difficult to determine the likelihoods for
$R$ needed to apply them. However, the rules can simplify
drastically in some situations. In particular, if Debbie
and Wanda started with a shared prior for $Y$ or $B$,
Wanda's state differs from Debbie's due to having collected
more data, and Debbie is willing to trust Wanda's
data analysis, then the rules imply that Debbie should
just adopt Wanda's state assignment wholesale.

Note that, in both the objective and subjective ap- 
proaches, starting out with shared priors is an ideal- 
ization. In the objective approach this is because it is 
unlikely that Debbie and Wanda have exactly the same 
knowledge about the region of interest, and in the sub- 
jective approach this is because their prior beliefs might 
simply be different. Nevertheless, in the objective ap- 
proach we can always imagine a (possibly hypothetical) 
time in the past at which Debbie and Wanda had exactly 
the same knowledge and, provided Debbie's knowledge is 
a subset of Wanda's current knowledge, the result still 
follows. This argument does not apply in the subjective 
case, but there are still circumstances in which the ideal 
of shared priors is a good approximation. 

Consider first the classical case. Debbie and Wanda
share a prior state assignment $P_0(Y) = P_1(Y) = P(Y)$
for the variable of interest. Wanda then obtains some
extra data in the form of the value $x$ of some random
variable $X$ that is correlated with $Y$. Before learning the
value of $X$, Wanda adopts a likelihood model for it, given
by conditional probabilities $P(X|Y)$, and we assume that
Debbie agrees with this likelihood model. Upon acquiring
the value $x$ of $X$, Wanda updates her probabilities
to $Q_1(Y) = P(Y|X = x)$, which can be computed from
Bayes' theorem, and then she reports $Q_1(Y)$ to Debbie.
In other words, Debbie learns that $R = Q_1$ and she must
condition on this data to obtain her improved state assignment
$P(Y|R = Q_1)$.

Proposition VI.1. If Debbie and Wanda share a prior
state assignment $P(Y)$ and likelihood model $P(X|Y)$ for
the data collected by Wanda, then Debbie's improved
state is $P(Y|R = Q_1) = Q_1(Y)$, where $Q_1(Y)$ is Wanda's
updated state assignment.

Proof. Because Debbie and Wanda have a shared prior
and likelihood assignment, the variable $R$ is simply
$R(x) = P(Y|X = x)$, where $P(Y|X = x)$ is the probability
distribution that Debbie would assign if she knew
the value of $X$. Lemma V.5 then implies that
$P(Y|R = Q_1) = Q_1$. □

Note that Aumann [50] has argued that there is a
unique posterior that objective Bayesians ought to assign
when their state assignments are common knowledge.
The above proposition is a special case of this in which
the unique state can be easily computed.

In the quantum case, the argument proceeds in precise
analogy. Debbie and Wanda start with a shared prior
state $\rho_B$ for region $B$. Wanda announces her state assignment
$\sigma^{(1)}_B$, which can be represented as the result of
conditioning $B$ on the value $x$ of a random variable $X$,
i.e. $\sigma^{(1)}_B = \rho_{B|X=x}$. We assume that Debbie and Wanda
agree upon the likelihood operator $\rho_{X|B}$ for $X$. Debbie
then has to compute her improved state $\rho_{B|R=\sigma^{(1)}_B}$.

Proposition VI.2. If Debbie and Wanda share a prior
state assignment $\rho_B$ and likelihood operator $\rho_{X|B}$ for the
data collected by Wanda, then Debbie's improved state is
$\rho_{B|R=\sigma^{(1)}_B} = \sigma^{(1)}_B$, where $\sigma^{(1)}_B$ is Wanda's updated state
assignment.

The proof is just the obvious generalization of the proof
of proposition VI.1, making use of lemma V.8 instead of
lemma V.5.
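
The shared-priors result can be illustrated numerically. The sketch below rests on an assumption that is not spelled out in this section: that conditioning a prior $\rho_B$ on a likelihood element $E$ takes the form $\rho_B^{1/2} E \rho_B^{1/2} / \mathrm{Tr}(E\rho_B)$. The prior, the likelihood elements and the function names are hypothetical.

```python
import numpy as np

def psd_sqrt(m):
    """Square root of a positive semidefinite Hermitian matrix."""
    evals, evecs = np.linalg.eigh(m)
    return (evecs * np.sqrt(np.clip(evals, 0, None))) @ evecs.conj().T

def condition(rho_b, effect):
    """Assumed hybrid Bayes update: rho_B^(1/2) E rho_B^(1/2) / Tr(E rho_B)."""
    root = psd_sqrt(rho_b)
    unnormalized = root @ effect @ root
    return unnormalized / np.trace(unnormalized).real

# Hypothetical shared prior on a qubit and a three-outcome likelihood {E_x}
# in which outcomes 0 and 1 have proportional elements.
rho_b = np.array([[0.7, 0.2], [0.2, 0.3]])
plus = np.array([[0.5, 0.5], [0.5, 0.5]])
effects = [0.3 * plus, 0.5 * plus, np.eye(2) - 0.8 * plus]

# Wanda observes x = 0 and reports sigma_1 = rho_{B|X=0}.
posteriors = [condition(rho_b, e) for e in effects]
sigma_1 = posteriors[0]

# Debbie only learns R = sigma_1, i.e. that x is one of the outcomes whose
# posterior equals sigma_1, so her likelihood element is the sum of those E_x.
matching = [x for x, post in enumerate(posteriors) if np.allclose(post, sigma_1)]
effect_r = sum(effects[x] for x in matching)
debbie_improved = condition(rho_b, effect_r)
```

Although Debbie cannot tell whether Wanda saw outcome 0 or outcome 1, her improved state coincides with the state Wanda announced.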



C. Discussion 

Although our results show that state improvement is
trivial in the case of shared priors, eqs. (74) and (75)
are still applicable when Debbie and Wanda do not share
prior states and, in that case, they give nontrivial results.
The analysis of such cases is a lot more involved, so we
do not consider any examples here.

In the classical case, the general methodology leading
to eq. (74) can be criticized. It is an onerous requirement
for Debbie to be able to articulate a likelihood for
all possible state assignments that Wanda might make.
This criticism is mitigated by the shared priors result,
which shows that, at least in this case, the likelihood
model need not be specified in detail. Such simplifications
might also occur in other models that do not depend
on shared priors. In any case, this criticism is not
unique to state improvement, since it can
be leveled at Bayesian methodology in general. It is always
a heavy requirement for an agent to specify a full
probability distribution over all the variables of interest.
For this reason, alternative Bayesian theories have been
developed with less onerous requirements, such as the
requirement to specify expectation values rather than full
probability distributions [ljl [H| ■ 

A criticism that is more specific to state improvement
is that the beliefs that Debbie uses to determine $P_0(Y)$
might be correlated with the beliefs that she uses to determine
the likelihood $P_0(R|Y)$, e.g. Debbie might be
biased towards believing that Wanda will report states
that are concentrated on values of $Y$ that Debbie herself
believes are likely. A generalization that takes these
correlations into account has been proposed [52].

Every criticism leveled against the classical methodol- 
ogy also applies to the quantum case and, no doubt, the 
proposed classical generalizations could be raised to the 
quantum level by applying the methods outlined in this 
paper. This is not done here because it is not our goal to 
say the final word on quantum state improvement, but 
only to point out that there is no need to reinvent the 
wheel when studying the quantum case because classical 
methods can be easily adapted using the formalism of 
conditional states. 

Finally, note that quantum state improvement has previously
been considered by Herbut [34], who adopted
an ad hoc procedure based on closeness of Debbie and
Wanda's states with respect to Hilbert-Schmidt distance.
It would be interesting to see if Herbut's rule can be 
derived using Bayesian methodology under a set of rea- 
sonable assumptions that Debbie could make about how 
Wanda arrived at her state assignment. 



VII. QUANTUM STATE POOLING 

The problem of state pooling concerns what happens 
when agents who each have their own state assignments 
want to make decisions as a group. To do so, they need to 
come up with a state assignment that accurately reflects 
the views of the group as a whole. 

In an ideal world, the agents would first reconcile 
their differences empirically so that everyone agrees on a 
common state assignment. The discussion of subjective 
Bayesian compatibility shows that it is possible for this 
to happen if their states satisfy the BFM compatibility 
criterion. Furthermore, as a consequence of the classical 



and quantum de Finetti theorems [H, [20|, [H, [HJ , if the 
agents can construct an exchangeable sequence of experi- 
ments then their states can be expected to converge in the 
long run by application of Bayesian conditioning. Nev- 
ertheless, it is not always possible to collect more data 
before a decision has to be made and, for the subjective 
Bayesian, there is also the question of how to combine 
sharply contradictory beliefs that do not satisfy compatibility
criteria in the first place.

The goal of this section is to provide a general methodology
for quantum pooling based on applying the principles
of quantum Bayesian inference, similar to the approach
to state improvement developed in §VI. In the
case of shared priors, we also derive a specific pooling rule
from this methodology that was previously proposed by
Spekkens and Wiseman [10]. However, before embarking
upon this discussion, it is useful to take a step back and 
look at the basic requirements for pooling and some of 
the specific pooling rules that have been proposed in the 
classical case. 



A. Review of pooling rules 

One reasonable requirement for a pooling rule is that 
the pooled state should be compatible with each agent's 
individual state assignment. If this is so then each agent 
is assured that it is possible for them to be vindicated by 
future observations. This is because subjective Bayesian 
compatibility guarantees that, for each agent, it is pos- 
sible that data could be collected that would cause the 
pooled state and the agent's individual state to become 
identical upon Bayesian conditioning. 

Consider the classical case where $n$ agents assign states
$Q_1(Y), Q_2(Y), \ldots, Q_n(Y)$ to a random variable $Y$. A linear
opinion pool is a rule where the pooled state $Q_{\mathrm{lin}}$ is
of the form

$$Q_{\mathrm{lin}}(Y) = \sum_{j=1}^{n} w_j Q_j(Y), \tag{76}$$

where $0 < w_j < 1$ and $\sum_{j=1}^{n} w_j = 1$. The weight $w_j$ can
be thought of as a measure of the amount of trust that
the group assigns to the $j$th agent. The state $Q_{\mathrm{lin}}(Y)$
is BFM compatible with every $Q_j(Y)$ because eq. (76)
is an ensemble decomposition of $Q_{\mathrm{lin}}(Y)$ in which each
agent's state appears. A linear opinion pool is typically
less sharply peaked than the individual agents' assignments.
In particular its entropy cannot be lower than
that of the lowest entropy individual state. It may be appropriate
to use it as a diplomatic solution. Indeed, this
sort of pooling rule may be applied even if the agents'
state assignments are not pairwise compatible.
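
A linear opinion pool is simple enough to sketch directly. In the illustration below the two agents' assignments are hypothetical and have disjoint supports, which shows that the pool is well defined even for incompatible assignments. The function names and the choice to reject non-positive weights are assumptions of this sketch.

```python
import numpy as np

def entropy(p):
    """Shannon entropy in bits (zeros are skipped)."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def linear_pool(states, weights):
    """Eq. (76): Q_lin(Y) = sum_j w_j Q_j(Y)."""
    states = np.asarray(states, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if np.any(weights <= 0) or not np.isclose(weights.sum(), 1.0):
        raise ValueError("weights must be positive and sum to 1")
    return weights @ states

# Hypothetical assignments by two agents over three values of Y. They have
# disjoint supports, so they are not compatible, yet the pool is well defined.
q1 = np.array([1.0, 0.0, 0.0])
q2 = np.array([0.0, 0.5, 0.5])
pooled = linear_pool([q1, q2], [0.6, 0.4])
```

In this example the pooled state has higher entropy than either agent's assignment, consistent with the remark that its entropy cannot fall below that of the lowest-entropy individual state.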

Linear opinion pools can be straightforwardly generalized
to the quantum case. Specifically, if $n$ agents assign
states $\sigma^{(1)}_B, \sigma^{(2)}_B, \ldots, \sigma^{(n)}_B$ to a quantum region $B$, then a
quantum linear opinion pool is a rule where the pooled
state $\sigma^{(\mathrm{lin})}_B$ is of the form

$$\sigma^{(\mathrm{lin})}_B = \sum_{j=1}^{n} w_j \sigma^{(j)}_B, \tag{77}$$

where $0 < w_j < 1$ and $\sum_{j=1}^{n} w_j = 1$. Similar remarks
apply to this as to the classical case.

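Classically, a multiplicative opinion pool [62] is a rule
whereby the pooled state is of the form

$$Q_{\mathrm{mult}}(Y) = c \prod_{j=1}^{n} Q_j(Y)^{w_j}, \tag{78}$$

where $c$ is a normalization constant,

$$c = \frac{1}{\sum_Y \prod_{j=1}^{n} Q_j(Y)^{w_j}}. \tag{79}$$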

Multiplicative pools typically result in a pooled state that
is more sharply peaked than any of the individual agents'
states. Normalizability implies that multiplicative pools
can only be applied to states that are jointly compatible,
meaning that there is at least one value $y$ of $Y$ such that
$Q_j(Y = y) > 0$ for all $j$. Any such value has nonzero
weight in $Q_{\mathrm{mult}}(Y)$, which guarantees that $Q_{\mathrm{mult}}(Y)$ is
compatible with every agent's individual assignment. As
shown below, a multiplicative pool may be appropriate
in an objective Bayesian framework where all the agents
start with a shared uniform prior and the differences in
their state assignments result from having collected different
data.
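
The behaviour described here can be sketched in a few lines, assuming that the multiplicative pool is the normalized product of the agents' states raised to the weights $w_j$. The example assignments and the function name are hypothetical.

```python
import numpy as np

def multiplicative_pool(states, weights):
    """Normalized product of Q_j(Y)**w_j over the agents."""
    states = np.asarray(states, dtype=float)
    weights = np.asarray(weights, dtype=float)
    product = np.prod(states ** weights[:, None], axis=0)
    total = product.sum()
    if total == 0:
        # No value of Y has nonzero probability for every agent.
        raise ValueError("states are not jointly compatible: the product vanishes everywhere")
    return product / total

# Hypothetical jointly compatible assignments with unit weights.
q1 = np.array([0.6, 0.3, 0.1])
q2 = np.array([0.5, 0.4, 0.1])
pooled = multiplicative_pool([q1, q2], [1.0, 1.0])
```

For these inputs the pooled state is more sharply peaked than either agent's assignment, while assignments with disjoint supports make the product vanish and the pool undefined.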

In order to account for the case where the shared prior
is not uniform, the multiplicative pool has to be generalized to

$$Q_{\mathrm{gmult}}(Y) = c \prod_{j=0}^{n} Q_j(Y)^{w_j}, \tag{80}$$

where the extra state $Q_0(Y)$ represents the shared prior
information.

Unlike with linear pools, it is not immediately obvious
how to generalize multiplicative pools to the quantum
case because the product of states in eq. (80) does not
have a unique generalization due to non-commutativity.



B. General methodology for state pooling

As with the other problems tackled in this paper, pooling
rules should be derived in a principled way from
the rules of Bayesian inference, rather than simply being
posited. One way to do this is to adopt the supra-Bayesian
approach. This works by requiring the group
of agents to put themselves in the shoes of Debbie the
decision maker who we met in the state improvement
section. Specifically, in the classical case, acting together,
they are asked to come up with a likelihood
function $P_0(R_1, R_2, \ldots, R_n|Y)$ that they think a neutral
decision maker (Debbie the supra-Bayesian) would assign,
where $R_j$ is a random variable that ranges over
all possible state assignments that the $j$th agent might
make; and a prior $P_0(Y)$, which can often just be taken
to be the uniform distribution or a shared prior that
the agents may have agreed upon at some point in the
past before their opinions diverged. They can then update
$P_0(Y)$ to $P_0(Y|R_1 = Q_1, R_2 = Q_2, \ldots, R_n = Q_n)$
via Bayesian conditioning and use this as the pooled
state $Q_{\mathrm{supra}}(Y)$. Pooling then becomes just an application
of the state improvement method discussed in
the previous section. In the quantum case, the equivalent
ingredients are a hybrid likelihood $\rho^{(0)}_{R_1R_2\ldots R_n|B}$ and
a prior quantum state $\rho^{(0)}_B$, and then the pooled state is
$\sigma^{(\mathrm{supra})}_B = \rho^{(0)}_{B|R_1=\sigma^{(1)}_B, R_2=\sigma^{(2)}_B, \ldots, R_n=\sigma^{(n)}_B}$, which can be
computed from quantum Bayesian conditioning.

Admittedly, it might be a pretty tall order to expect
the agents to be able to act together as a fictional supra-Bayesian
Debbie, but this method does allow conditions
under which the different pooling rules should be used
to be derived rigorously, which in turn gives insight into
when they might be useful as rules-of-thumb more generally.
It also has the advantage that it allows quantum
generalizations to be derived unambiguously, since the
necessary tools of quantum Bayesian inference have been
developed in [1] and the preceding sections. In particular,
it resolves the ambiguity surrounding the correct quantum
generalization of the multiplicative opinion pool.

To illustrate this, we show that, in the case of shared
priors, the supra-Bayesian approach can be used to motivate
the two-agent case of the quantum generalized multiplicative
pool with $w_0 = -1$, $w_1 = 1$, $w_2 = 1$.



C. The case of shared priors 

For simplicity, we specialize to the case of a group of
two agents, Wanda and Theo. First consider the classical
case where Wanda and Theo have individual state
assignments $Q_1(Y)$ and $Q_2(Y)$. We assume that Wanda
and Theo started from a shared prior $P(Y)$, which can
be used as Debbie's prior $P_0(Y) = P(Y)$ in the supra-Bayesian
approach, and that the current differences in
Wanda and Theo's state assignments are due to having
collected different data. The additional data available
to Wanda and Theo are modeled as values $x_1$ and
$x_2$ of random variables $X_1$ and $X_2$ respectively. Before
learning the values of $X_1$ and $X_2$, Wanda and Theo
assigned likelihood functions $P(X_1|Y)$ and $P(X_2|Y)$,
which, when combined with the prior $P(Y)$, determine
their current state assignments via Bayes' theorem, i.e.
$Q_j(Y) = P(Y|X_j = x_j)$.

We assume that it is possible to assign a joint likelihood
function $P(X_1, X_2|Y)$, such that $P(X_1|Y)$ and $P(X_2|Y)$
are obtained by marginalization. It is unrealistic to think
that Wanda and Theo must specify this joint likelihood
in detail. Fortunately, in order to obtain a generalized
multiplicative pool, they need only agree on some of its
broad features. In particular, if they agree that minimal
sufficient statistics for $X_1$ and $X_2$ are conditionally independent
given $Y$, then supra-Bayesian pooling gives rise
to a generalized multiplicative pool.

Theorem VII.1. If a minimal sufficient statistic for
$X_1$ with respect to $Y$ and a minimal sufficient statistic
for $X_2$ with respect to $Y$ are conditionally independent
given $Y$, then the supra-Bayesian pooled state
$Q_{\mathrm{supra}}(Y) = P_0(Y|R_1 = Q_1, R_2 = Q_2)$ is given by

$$Q_{\mathrm{supra}}(Y) = c\,\frac{Q_1(Y)Q_2(Y)}{P(Y)}, \tag{81}$$

where $c$ is a normalization factor, independent of $Y$.

Comparing this result with eq. (80) shows that this
is a generalized multiplicative pool with $Q_0(Y) = P(Y)$,
$w_0 = -1$, $w_1 = 1$ and $w_2 = 1$. In the special case of a
uniform prior, this reduces to

$$Q_{\mathrm{supra}}(Y) = c'Q_1(Y)Q_2(Y), \tag{82}$$

where $c'$ is a different normalization constant. This is a
multiplicative pool with $w_1 = 1$ and $w_2 = 1$.
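
The content of theorem VII.1 can be checked on a small example. The sketch below assumes a hypothetical model in which the data $X_1$ and $X_2$ themselves are conditionally independent given $Y$ and distinct data values give distinct posteriors, so that conditioning on the reported states is the same as conditioning on the data. All numbers are illustrative.

```python
import numpy as np

# Hypothetical model: Y has three values, and X1, X2 are binary data that are
# conditionally independent given Y (a special case of the theorem's assumption).
p_y = np.array([0.5, 0.3, 0.2])                       # shared prior P(Y)
p_x1_given_y = np.array([[0.9, 0.4, 0.1],             # indexed [x1, y]
                         [0.1, 0.6, 0.9]])
p_x2_given_y = np.array([[0.7, 0.5, 0.2],             # indexed [x2, y]
                         [0.3, 0.5, 0.8]])

def posterior(likelihood_row, prior):
    """Bayes' theorem for a single observed value."""
    joint = likelihood_row * prior
    return joint / joint.sum()

x1, x2 = 1, 0                                         # values seen by Wanda and Theo
q1 = posterior(p_x1_given_y[x1], p_y)                 # Wanda's state Q_1(Y)
q2 = posterior(p_x2_given_y[x2], p_y)                 # Theo's state Q_2(Y)

# Generalized multiplicative pool, eq. (81): proportional to Q_1 Q_2 / P.
pooled = q1 * q2 / p_y
pooled /= pooled.sum()

# Direct conditioning on both data values, using the factorized joint likelihood.
direct = posterior(p_x1_given_y[x1] * p_x2_given_y[x2], p_y)
```

The pooled state computed from the two announced assignments and the shared prior agrees with the posterior obtained by conditioning on both data values directly.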



Proof of theorem VII.1. By definition, the supra-Bayesian
pooled state is $Q_{\mathrm{supra}}(Y) = P(Y|R_1 = Q_1, R_2 = Q_2)$
and this can be computed from the prior
$P(Y)$ and the likelihood $P(R_1, R_2|Y)$ via Bayes' theorem.
Now, $R_j$ can be thought of as a function-valued
statistic for $X_j$ via $R_j(x_j) = P(Y|X_j = x_j)$. It is a
minimal sufficient statistic with respect to $Y$ because
$R_j(x_j) = R_j(x'_j)$ iff $P(Y|X_j = x_j) = P(Y|X_j = x'_j)$.
By assumption, there exist minimal sufficient statistics
for $X_1$ and for $X_2$ that are conditionally independent
given $Y$. However, any minimal sufficient statistic is a
bijective function of any other minimal sufficient statistic
for the same variable, so if any pair of such statistics are
conditionally independent then they all are. Therefore,
$R_1$ and $R_2$ are conditionally independent given $Y$, and
so by CI4

$$P(R_1, R_2|Y) = P(R_1|Y)P(R_2|Y). \tag{83}$$

The terms $P(R_j|Y)$ can be inverted via Bayes' theorem
to obtain $P(R_j|Y) = P(Y|R_j)P(R_j)/P(Y)$, which gives

$$P(R_1, R_2|Y) = P(R_1)P(R_2)\frac{P(Y|R_1)P(Y|R_2)}{P(Y)^2}. \tag{84}$$

Using Bayes' theorem again in the form
$P(Y|R_1, R_2) = P(R_1, R_2|Y)P(Y)/P(R_1, R_2)$ gives

$$P(Y|R_1, R_2) = \frac{P(R_1)P(R_2)}{P(R_1, R_2)}\frac{P(Y|R_1)P(Y|R_2)}{P(Y)}, \tag{85}$$

which, upon substituting the announced values of $R_1$ and
$R_2$, gives

$$Q_{\mathrm{supra}}(Y) = \frac{P(R_1 = Q_1)P(R_2 = Q_2)}{P(R_1 = Q_1, R_2 = Q_2)} \times \frac{P(Y|R_1 = Q_1)P(Y|R_2 = Q_2)}{P(Y)}. \tag{86}$$

The term $c = [P(R_1 = Q_1)P(R_2 = Q_2)]/P(R_1 = Q_1, R_2 = Q_2)$
is independent of $Y$, so it can
be determined from the normalization constraint
$\sum_Y Q_{\mathrm{supra}}(Y) = 1$. Also, lemma V.5 implies
$P(Y|R_j = Q_j) = Q_j(Y)$, so we have

$$Q_{\mathrm{supra}}(Y) = c\,\frac{Q_1(Y)Q_2(Y)}{P(Y)}, \tag{87}$$

as required.

In the quantum case, Wanda and Theo have individual
state assignments $\sigma_B^{(1)}$ and $\sigma_B^{(2)}$. Again, any
differences in Wanda and Theo's state assignments are assumed to
arise from having collected different data, before which
they agreed upon a shared prior $\rho_B$, which can be used
as Debbie's prior state $\sigma_B^{(0)} = \rho_B$ in the
supra-Bayesian approach.

Again, we assume that Wanda and Theo have observed
values $x_1$ and $x_2$ of random variables $X_1$ and $X_2$,
with likelihood operators $\rho_{X_1|B}$ and $\rho_{X_2|B}$. Wanda and
Theo's states result from conditioning the shared prior
on their data using these likelihoods. We assume that
there is a joint likelihood $\rho_{X_1X_2|B}$, of which Wanda and
Theo's likelihoods are marginals. Wanda and Theo need
not agree on the full details of this joint likelihood, only
that minimal sufficient statistics for $X_1$ and $X_2$ satisfy
QCI4, which is slightly weaker than conditional independence.
We then have

Theorem VII.2. If a minimal sufficient statistic $t_1$ for
$X_1$ with respect to $B$ and a minimal sufficient statistic $t_2$
for $X_2$ with respect to $B$ satisfy

$$\rho_{t_1(X_1)t_2(X_2)|B} = \rho_{t_1(X_1)|B}\,\rho_{t_2(X_2)|B}, \qquad (88)$$

then the supra-Bayesian pooled state
$\sigma_B^{(\mathrm{supra})} = \sigma^{(0)}_{B|R_1=\sigma_B^{(1)},R_2=\sigma_B^{(2)}}$
is given by

$$\sigma_B^{(\mathrm{supra})} = c\,\sigma_B^{(1)}\rho_B^{-1}\sigma_B^{(2)}, \qquad (89)$$

where $c$ is a normalization factor, independent of $B$.

Eq. (89) is the quantum generalization of the generalized
multiplicative pool with $w_0 = -1$, $w_1 = 1$, $w_2 = 1$.
Despite appearances, this expression is symmetric under
exchange of 1 and 2. This follows from the condition (88),
which implies that $\rho_{t_1(x_1)|B}$ and $\rho_{t_2(x_2)|B}$ must commute.
When $\rho_B$ is a maximally mixed state, eq. (89) reduces to

$$\sigma_B^{(\mathrm{supra})} = c'\,\sigma_B^{(1)}\sigma_B^{(2)}, \qquad (90)$$

where $c'$ is a different normalization constant. This is a
quantum generalization of the multiplicative pool with
$w_1 = 1$, $w_2 = 1$.

Although conditional independence of the minimal
sufficient statistics was assumed in the classical case,
eq. (88) is strictly weaker than conditional independence,
as explained in §IV A.
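A small numerical experiment can illustrate eq. (89) and its symmetry under exchange of 1 and 2. This is a sketch under stated assumptions rather than the construction used in the paper: $B$ is taken to be a pair of qubits, the components of the likelihood operators are modelled as commuting POVM elements `e1` and `e2` acting on different qubits, and conditioning is assumed to take the form $\rho_B^{1/2} E \rho_B^{1/2} / \mathrm{Tr}(E\rho_B)$, which is what the star-product form of Bayes' theorem gives if $M \star N = N^{1/2} M N^{1/2}$. All function names and numerical values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)


def hermitian_power(m, p):
    """Real power of a positive-definite Hermitian matrix."""
    vals, vecs = np.linalg.eigh(m)
    return (vecs * vals**p) @ vecs.conj().T


def random_full_rank_state(dim):
    """Random density matrix with full rank, so that its inverse exists."""
    a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = a @ a.conj().T
    return rho / np.trace(rho).real


def condition(rho_b, effect):
    """Assumed conditioning rule: rho^(1/2) E rho^(1/2), normalized."""
    root = hermitian_power(rho_b, 0.5)
    unnorm = root @ effect @ root
    return unnorm / np.trace(unnorm).real


def quantum_pool(sigma1, sigma2, rho_b):
    """Eq. (89): sigma1 rho^(-1) sigma2, with c fixed by normalization."""
    unnorm = sigma1 @ hermitian_power(rho_b, -1.0) @ sigma2
    return unnorm / np.trace(unnorm)


rho_b = random_full_rank_state(4)              # shared prior on two qubits
id2 = np.eye(2)
v = np.array([np.cos(0.4), np.sin(0.4)])
e1 = np.kron(np.diag([0.8, 0.3]), id2)         # Wanda's effect (first qubit)
e2 = np.kron(id2, 0.7 * np.outer(v, v) + 0.1 * id2)  # Theo's effect (second qubit)

sigma1 = condition(rho_b, e1)                  # Wanda's state assignment
sigma2 = condition(rho_b, e2)                  # Theo's state assignment
pooled = quantum_pool(sigma1, sigma2, rho_b)   # Debbie's pooled state
joint = condition(rho_b, e1 @ e2)              # conditioning on both data

print(np.allclose(pooled, joint))
print(np.allclose(pooled, quantum_pool(sigma2, sigma1, rho_b)))
```

Because `e1` and `e2` commute, their product is a valid joint effect, and the two factors of $\rho_B^{1/2}$ adjacent to $\rho_B^{-1}$ cancel, which is why the pooled operator agrees with conditioning on both outcomes and does not depend on the order of the two agents.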
Proof of theorem VII.2. By definition, the supra-Bayesian pooled
state is $\sigma_B^{(\mathrm{supra})} = \rho_{B|R_1=\sigma_B^{(1)},R_2=\sigma_B^{(2)}}$
and this can be computed from the prior $\rho_B$ and the
likelihood $\rho_{R_1R_2|B}$ via Bayes' theorem. Each $R_j$ is an
operator-valued statistic for $X_j$ via $R_j(x_j) = \rho_{B|X_j=x_j}$.
They are minimal sufficient statistics with respect to $B$
because $R_j(x_j) = R_j(x'_j)$ iff $\rho_{B|X_j=x_j} = \rho_{B|X_j=x'_j}$. By
assumption, there exist minimal sufficient statistics, $t_1$
and $t_2$, for $X_1$ and $X_2$ that satisfy

$$\rho_{t_1(X_1)t_2(X_2)|B} = \rho_{t_1(X_1)|B}\,\rho_{t_2(X_2)|B}, \qquad (91)$$

but since any minimal sufficient statistic is a bijective
function of any other minimal sufficient statistic for the
same variable, $R_1$ and $R_2$ must also satisfy

$$\rho_{R_1R_2|B} = \rho_{R_1|B}\,\rho_{R_2|B}. \qquad (92)$$



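The terms $\rho_{R_j|B}$ can be inverted via Bayes' theorem to
obtain $\rho_{R_j|B} = \rho_{B|R_j} \star (\rho_{R_j}\rho_B^{-1})$, which gives

$$\rho_{R_1R_2|B} = \left[\rho_{B|R_1} \star (\rho_{R_1}\rho_B^{-1})\right]\left[\rho_{B|R_2} \star (\rho_{R_2}\rho_B^{-1})\right]. \qquad (93)$$

Since $R_1$ and $R_2$ are classical, the operators $\rho_{R_j}$ commute
with everything else and so expanding the $\star$-products
gives

$$\rho_{R_1R_2|B} = \rho_{R_1}\rho_{R_2}\,\rho_B^{-1/2}\rho_{B|R_1}\rho_B^{-1}\rho_{B|R_2}\rho_B^{-1/2}. \qquad (94)$$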

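Using Bayes' theorem again in the form
$\rho_{B|R_1R_2} = \rho_{R_1R_2|B} \star (\rho_B\rho_{R_1R_2}^{-1})$
and noting that $\rho_{R_1R_2}$ commutes with everything else gives

$$\rho_{B|R_1R_2} = \rho_{R_1}\rho_{R_2}\rho_{R_1R_2}^{-1}\left(\rho_{B|R_1}\rho_B^{-1}\rho_{B|R_2}\right), \qquad (95)$$

which, upon substituting the announced values of $R_1$ and
$R_2$, gives

$$\sigma_B^{(\mathrm{supra})} = \frac{\rho_{R_1=\sigma_B^{(1)}}\,\rho_{R_2=\sigma_B^{(2)}}}{\rho_{R_1=\sigma_B^{(1)},R_2=\sigma_B^{(2)}}} \times \rho_{B|R_1=\sigma_B^{(1)}}\,\rho_B^{-1}\,\rho_{B|R_2=\sigma_B^{(2)}}. \qquad (96)$$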



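The term $c = \rho_{R_1=\sigma_B^{(1)}}\,\rho_{R_2=\sigma_B^{(2)}} / \rho_{R_1=\sigma_B^{(1)},R_2=\sigma_B^{(2)}}$
is independent of $B$, so it can be determined from the
normalization constraint $\mathrm{Tr}_B(\sigma_B^{(\mathrm{supra})}) = 1$.
Also, the quantum counterpart of the lemma used in the classical
proof implies $\rho_{B|R_j=\sigma_B^{(j)}} = \sigma_B^{(j)}$, so we have

$$\sigma_B^{(\mathrm{supra})} = c\,\sigma_B^{(1)}\rho_B^{-1}\sigma_B^{(2)}, \qquad (97)$$

as we set out to prove. □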
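The two $\star$-product expansions used in the proof can be spelled out term by term. The sketch below assumes the conditional states formalism's definition $M \star N \equiv N^{1/2} M N^{1/2}$, which is not restated in this section, and uses the fact that the classical factors $\rho_{R_j}$ and $\rho_{R_1R_2}$ commute with every other operator.

```latex
\begin{align*}
\rho_{R_j|B}
  &= \rho_{B|R_j} \star \left(\rho_{R_j}\rho_B^{-1}\right)
   = \left(\rho_{R_j}\rho_B^{-1}\right)^{1/2} \rho_{B|R_j}
     \left(\rho_{R_j}\rho_B^{-1}\right)^{1/2}
   = \rho_{R_j}\,\rho_B^{-1/2}\rho_{B|R_j}\rho_B^{-1/2}, \\
\rho_{R_1R_2|B}
  &= \rho_{R_1|B}\,\rho_{R_2|B}
   = \rho_{R_1}\rho_{R_2}\,
     \rho_B^{-1/2}\rho_{B|R_1}\rho_B^{-1}\rho_{B|R_2}\rho_B^{-1/2}, \\
\rho_{B|R_1R_2}
  &= \rho_{R_1R_2|B} \star \left(\rho_B\rho_{R_1R_2}^{-1}\right)
   = \rho_{R_1R_2}^{-1}\,\rho_B^{1/2}\,\rho_{R_1R_2|B}\,\rho_B^{1/2}
   = \rho_{R_1}\rho_{R_2}\rho_{R_1R_2}^{-1}\,
     \rho_{B|R_1}\rho_B^{-1}\rho_{B|R_2}.
\end{align*}
```

The first line is the inversion step behind eq. (93), the second reproduces eq. (94), and in the third the outer factors $\rho_B^{1/2}$ cancel the outer factors $\rho_B^{-1/2}$ of eq. (94), leaving eq. (95).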



D. Comparison to other approaches 

Quantum state pooling has been discussed previously 
in HiH Sll- Both [H and jl| propose pooling 



methodologies that seem ad hoc from the Bayesian point 
of view, but, as with Herbut's approach to improvement, 
it would be interesting to see whether they could be jus- 
tified in the supra-Bayesian approach. 

Jacobs [28, 39] considers quantum state pooling in the
case where Wanda and Theo arrive at their states by
making direct measurements on the system of interest. In
particular, he derives a generalization of the multiplicative
rule that is distinct from the one we derive. From the
perspective of the conditional states formalism, his rule
is not a valid way of combining state assignments. The
reason is that Jacobs takes collapse rules in quantum theory,
such as the von Neumann-Lüders projection postulate
or its generalization to POVMs,
as quantum versions of Bayesian conditioning, but in the
conditional states framework, such collapse rules are
explicitly not instances of Bayesian conditioning, as argued
in i|TTI3]and [if. 

Spekkens and Wiseman [10] consider the case of pooling
via remote measurements, wherein there is a shared
prior state $\rho_{BA_1A_2}$ of a tripartite system and Wanda and
Theo arrive at their differing state assignments for $B$ by
making POVM measurements on $A_1$ and $A_2$ respectively,
as depicted in fig. 1b. They obtain the same generalized
multiplicative pool that has been derived here, namely
$c\,\sigma_B^{(1)}\rho_B^{-1}\sigma_B^{(2)}$, for two restricted classes of
states $\rho_{BA_1A_2}$.
Both of these classes are special cases of states for which
$A_1$ and $A_2$ are conditionally independent given $B$. If
$\rho_{BA_1A_2}$ satisfies this conditional independence then so
does any hybrid state $\rho_{BX_1X_2}$ obtained by measuring
POVMs $\rho_{X_1|A_1}$ on system $A_1$ and $\rho_{X_2|A_2}$ on system $A_2$.
This is because the conditional mutual information cannot
be increased by applying local CPT maps to $A_1$ and
$A_2$. The minimal sufficient statistics for $X_1$ and $X_2$ then
also satisfy conditional independence because they are
just local processings of $X_1$ and $X_2$. Therefore, the
assumptions of theorem VII.2 follow from this conditional
independence. As such, the result of [10] is seen to be a
special case of the one derived here.

What we have shown is that the Spekkens and Wiseman
pooling rule holds under much weaker conditions
than the conditional independence of $A_1$ and $A_2$ given
$B$. For example, it also holds for states of the form
$\rho_{BA'_1A'_2} \otimes \rho_{A''_1A''_2}$, where
$\mathcal{H}_{A_j} = \mathcal{H}_{A'_j} \otimes \mathcal{H}_{A''_j}$ and $A'_1$ and $A'_2$
are conditionally independent given $B$. For such states,
$A_1$ and $A_2$ are not conditionally independent given $B$
whenever $\rho_{A''_1A''_2}$ is a correlated state, but $A''_1$ and $A''_2$
contain no information about $B$, so they will not be
correlated with the minimal sufficient statistics for $X_1$ and
$X_2$ and consequently the minimal sufficient statistics are
conditionally independent given $B$, which is sufficient to
derive the result [63]. Of course, our results also
significantly generalize those of [10] because theorem VII.2
applies to a broader set of causal scenarios than just the
remote measurement scenario.

Finally, it is worth pointing out that both Jacobs
[28, 39] and Spekkens and Wiseman [10] adopt a pooling
methodology that is less widely applicable than the one
used in the present work. In [10], for example, a fourth
party called Oswald (the overseer) is introduced into the
game, in addition to the two agents and the decision-maker
(whom they call the pooler). Before any data is
collected, everyone shares a prior $\rho_B$ for the region of
interest. In addition, Wanda, Theo and Oswald assign
a shared prior $\rho_{BX_1X_2}$ including the data variables that
Wanda and Theo are going to observe [64]. Oswald has
access to both Wanda and Theo's data, i.e. he learns the
values $x_1$ and $x_2$ that Wanda and Theo observe, so he
can update his state to the posterior $\rho_{B|X_1=x_1,X_2=x_2}$. It
is then asserted that if Oswald's posterior can be
determined from the data available to Debbie, then this is
what she should assign as the pooled state. Since Debbie
only knows Wanda and Theo's state assignments and the
prior $\rho_B$, this is possible only if Oswald's posterior can
be computed from these alone.

This methodology is less widely applicable than the 
one presented here because it does not specify what to do 
if Debbie cannot determine Oswald's posterior, whereas 
ours does. In fact, there are situations in which the mul- 
tiplicative pooling rule is applicable even though Debbie 
cannot determine Oswald's posterior using the data that 
she has available. Therefore, even though the rule of 
adopting Oswald's posterior if it can be determined is 
indeed correct in the supra-Bayesian approach, requiring 
this is an unnecessary restriction and it is better to make 
do without Oswald. 

It is useful to consider how such situations can
arise. By learning $\rho_{B|X_1=x_1}$ and $\rho_{B|X_2=x_2}$, Debbie
learns a minimal sufficient statistic for $X_1$ with
respect to $B$ and a minimal sufficient statistic for
$X_2$ with respect to $B$ and hence Debbie's posterior
is $\rho_{B|R_1(X_1)=R_1(x_1),R_2(X_2)=R_2(x_2)}$, where the function
$R_j(x_j) = \rho_{B|X_j=x_j}$ is the state-valued minimal
sufficient statistic for $X_j$. This is identical to Oswald's
posterior iff $(R_1, R_2)$ happens to be a sufficient statistic
for the pair $(X_1, X_2)$ with respect to $B$, i.e. iff
$\rho_{B|R_1(X_1)=R_1(x_1),R_2(X_2)=R_2(x_2)} = \rho_{B|X_1=x_1,X_2=x_2}$. In
general, this is not the case, since it is only guaranteed that
$R_1$ and $R_2$ are locally sufficient for the individual data,
i.e. $\rho_{B|R_1(X_1)=R_1(x_1)} = \rho_{B|X_1=x_1}$ and
$\rho_{B|R_2(X_2)=R_2(x_2)} = \rho_{B|X_2=x_2}$, and not globally
sufficient for the pair. However, Debbie only has enough data to
reconstruct Oswald's posterior if they are in fact globally
sufficient, that is, if
$\rho_{B|R_1(X_1)=R_1(x_1),R_2(X_2)=R_2(x_2)} = \rho_{B|X_1=x_1,X_2=x_2}$.



Y                  0    0    0    0    1    1    1    1
X_1                0    0    1    1    0    0    1    1
X_2                0    1    0    1    0    1    0    1

P(Y, X_1, X_2)     1/4  0    0    1/4  0    1/4  1/4  0

TABLE III: A prior state for which Debbie cannot determine
Oswald's posterior, but for which the multiplicative pooling rule
still holds.



A classical example suffices to show that our pooling
rule sometimes applies even in cases where Debbie cannot
reconstruct Oswald's posterior. Suppose $Y$, $X_1$ and
$X_2$ are classical bits and Oswald's prior is given by
table III. With this assignment, the shared prior for $Y$
is $P(Y = 0) = P(Y = 1) = 1/2$. Learning the value of
$X_j$ on its own gives no further information about $Y$, i.e.
$P(Y|X_j = x_j) = P(Y)$, independently of the value of $x_j$,
so both Wanda and Theo simply report the uniform
distribution back to Debbie. Any minimal sufficient statistic
for $X_j$ is trivial, consisting of just a single value, so the
sufficient statistics for $X_1$ and $X_2$ are trivially
conditionally independent and thus our derivation of the
multiplicative pooling rule holds. Unsurprisingly, in this case
it just says that Debbie should continue to assign the
uniform distribution. On the other hand, knowing both the
value of $X_1$ and the value of $X_2$ is enough to determine
$Y$ uniquely, so Oswald's posterior is a point measure and
there is no way that Debbie could determine it from the
data she has available. The reason why this happens is
that all the information about $Y$ is contained in the
correlations between $X_1$ and $X_2$, i.e. $P(Y = 0|X_1 = X_2) = 1$
and $P(Y = 0|X_1 \neq X_2) = 0$, and Oswald is the only
agent who has access to this data.
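The claims made about table III can be verified by direct computation. The following is a minimal sketch: the array layout `joint[y, x1, x2]` and the helper names are illustrative choices, while the probabilities themselves are those of the table, in which $Y = 0$ exactly when $X_1 = X_2$.

```python
import numpy as np

# joint[y, x1, x2] = P(Y = y, X1 = x1, X2 = x2) as in table III:
# each of the four allowed triples has probability 1/4, with y = x1 XOR x2.
joint = np.zeros((2, 2, 2))
for x1 in (0, 1):
    for x2 in (0, 1):
        joint[x1 ^ x2, x1, x2] = 0.25

prior_y = joint.sum(axis=(1, 2))               # shared prior P(Y)


def wanda_posterior(x1):
    """P(Y | X1 = x1), the state Wanda reports."""
    p = joint[:, x1, :].sum(axis=1)
    return p / p.sum()


def theo_posterior(x2):
    """P(Y | X2 = x2), the state Theo reports."""
    p = joint[:, :, x2].sum(axis=1)
    return p / p.sum()


def oswald_posterior(x1, x2):
    """P(Y | X1 = x1, X2 = x2), available only to Oswald."""
    p = joint[:, x1, x2]
    return p / p.sum()


def multiplicative_pool(q1, q2, prior):
    """Q1(Y) Q2(Y) / P(Y), normalized."""
    unnorm = q1 * q2 / prior
    return unnorm / unnorm.sum()


for x1 in (0, 1):
    for x2 in (0, 1):
        pooled = multiplicative_pool(wanda_posterior(x1), theo_posterior(x2), prior_y)
        print(x1, x2, pooled, oswald_posterior(x1, x2))
```

For every pair of data values the pooled state stays uniform, whereas Oswald's posterior puts all its weight on a single value of $Y$, so the reported states alone cannot reveal it.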



VIII. CONCLUSIONS 

In this paper, we have developed a Bayesian approach 
to quantum state compatibility, improvement and pool- 
ing, based on the principle that states should always be 
updated by a quantum analog of Bayesian conditioning. 
This improves upon previous approaches, which were 
more ad hoc in nature. Due to our use of the conditional 
states formalism, our results apply to a much wider range 
of causal scenarios than previous approaches. Indeed, the 
ability of this formalism to unify the description of many 
distinct causal arrangements explains the otherwise puz- 
zling fact that authors considering very different causal 
arrangements have found the same results. For instance, 
the compatibility criterion found by Brun, Finkelstein
and Mermin in the case of remote measurements [24] is
identical to the one found by Jacobs in the case of
sequential measurements [28].

This paper only represents the beginning of a Bayesian 
approach to these problems; there is a lot of scope for fur- 
ther work. For example, it would be interesting to deter- 
mine when a quantum linear pooling rule can be derived 
from Bayesian principles, as it has been in the classical 
case [50], and whether the results of previous methodologies
for quantum state improvement and pooling can
be reconstructed from a Bayesian point of view. How- 
ever, perhaps the most important lesson of this paper is 
that the conditional states formalism can vastly simplify 
the task of generalizing results from classical probabil- 
ity to the quantum domain. Definitions, theorems and 
proofs can often be ported almost mechanically from classical
probability to quantum theory by making use of the
appropriate analogies. Many aspects of quantum theory
that might appear, by the lights of the conventional
quantum formalism, to have no good classical analogue, 
are seen under the new formalism to be generalizations of 
very familiar features of Bayesian probability theory. As 
such, this new formalism helps us to focus our attention 
on those aspects of quantum theory that truly distinguish 
it from classical probability theory, such as violations of 
Bell inequalities, the impossibility of broadcasting, and 
monogamy constraints on correlations. 